Linear Precoding Designs for Amplify-and-Forward Multiuser Two-Way Relay Systems

Rui Wang, Meixia Tao, Senior Member, IEEE, and Yongwei Huang, Member, IEEE



Abstract — Two-way relaying can improve spectral efficiency in 
two-user cooperative communications. It also has great potential 
in multiuser systems. A major problem of designing a multiuser 
two-way relay system (MU-TWRS) is transceiver or precoding 
design to suppress co-channel interference. This paper aims to 
study linear precoding designs for a cellular MU-TWRS where a 
multi-antenna base station (BS) conducts bi-directional commu- 
nications with multiple mobile stations (MSs) via a multi-antenna 
relay station (RS) with amplify-and-forward relay strategy. The 
design goal is to optimize uplink performance, including total 
mean-square error (Total-MSE) and sum rate, while maintaining 
individual signal-to-interference-plus-noise ratio (SINR) requirement
for downlink signals. We show that the BS precoding design
with the RS precoder fixed can be converted to a standard
second-order cone program (SOCP), so that the optimal solution
can be obtained efficiently. The RS precoding design with the BS
precoder fixed, on the other hand, is non-convex and we present 
an iterative algorithm to find a local optimal solution. Then, the 
joint BS-RS precoding is obtained by solving the BS precoding 
and the RS precoding alternately. Comprehensive simulations
are conducted to demonstrate the effectiveness of the proposed
precoding designs.

Index Terms — MIMO precoding, two-way relaying, non- 
regenerative relay, minimum mean-square-error (MMSE), con- 
vex optimization. 

I. Introduction 

Due to complex wireless propagation environments, such 
as multi-path fading, shadowing and interference, the signals 
received by a remote destination receiver are not always strong 
enough to be decoded correctly. This problem has been consid- 
ered as a main obstacle in the development of modern wireless 
communication systems. Recently, relay assisted cooperative 
communication has been proposed as an efficient way to deal 
with this problem, which now has received great attention from 
both academia and industry. One example of relay-assisted
cooperative communication is the one-way relay system, which
has been well studied in the past decade [1], [2]. Although it has
shown great potential in, for example, transmission reliability,
energy saving and coverage extension, one-way relaying
reduces spectral efficiency due to the half-duplex
constraint.

A promising technique to improve the spectral efficiency of
one-way relaying is to apply network coding [3], resulting in
two-way relaying, which has now attracted great attention [4]-
H . Two-way relaying applies the principle of network coding 
at the relay node so as to mix the signals received from the 
two source nodes who wish to exchange information with each 
other and then employs at each destination self-interference 
(SI) cancelation to extract the desired information. Compared 
with traditional one-way relaying, spectral efficiency of two- 
way relaying can be significantly improved since only two 
time slots instead of four time slots are needed to complete 
one round of information exchange. 

In this work, we consider two-way relaying in multiuser 
systems. As in traditional multiuser systems, it is crucial to 
mitigate co-channel interference (CCI) for multiuser two-way 
relay system (MU-TWRS). An advanced method to suppress 
CCI is to apply multiple-input multiple-output (MIMO) tech- 
nique. Therein, transceiver or precoding should be carefully 
designed at each multi-antenna station, especially at the relay 
station (RS) O-O. In 1, 0, authors study linear relay 
precoding for MU-TWRS with decode-and-forward (DF) relay 
strategy. Since the received signals are fully decoded in the first
time slot, the relay precoding only affects the transmission 
in the second time slot. Then, by using zero-forcing (ZF) 
precoding, the relay precoding studied in O, reduces to 
a power allocation problem. The amplify-and-forward (AF) 
relay precoding, however, differs considerably from DF case 
as the transmissions of the first and second time slots are 
tightly coupled, and hence it is more challenging. Using ZF
and minimum mean-square-error (MMSE) criteria, the authors in
[10]-[13] study precoding design for an AF-based MU-TWRS
with multiple pairs of users. In particular, explicit
analytical results are derived in [13] for system performance
evaluation. Relay precoding design for the AF-based MU-TWRS
with multiple pairs of users is also considered in
our previous work [14]. Unlike [10]-[13], we do not impose
any structural constraint on the relay precoder and thus the
obtained results can approach the optimal performance [14].
In [15], the authors study an AF MU-TWRS model with one base
station (BS) and multiple mobile stations (MSs). By using a ZF
precoding scheme, explicit analytical results are also provided
as in [13]. It is worth noting that the aforementioned ZF
based precoding designs all impose certain constraints on the
number of relay antennas, which may not be satisfied in some
scenarios.

In this paper, we consider linear precoding design for a 
cellular MU-TWRS where a multi-antenna BS intends to 
conduct bi-directional communications with multiple MSs via 
a multi-antenna RS. Our work differs from in that we 
adopt AF relay strategy rather than DF for its simplicity in 
practical implementation. However, as mentioned previously, 
the precoding design with AF relay strategy is more chal-
lenging. Our work is also different from 031 since we do 
not impose any structures on precoders. Our design goal is 
to enhance uplink performance subject to individual signal-to- 
interference-plus-noise ratio (SINR) requirement for downlink 
signals. Specifically, total mean-square error (Total-MSE) and
sum rate are chosen to measure the performance of uplink. 
Since linear precoding can be employed at the BS, RS or 
both, three associated optimization problems are considered. 
When precoding is only conducted at the BS with the RS 
precoder fixed, we show that this optimization problem can 
be converted to a standard second-order cone programming 
(SOCP), thus the optimal solution can be obtained efficiently. 
The RS precoding with the BS precoder fixed, on the other 
hand, is non-convex and we present an iterative algorithm to 
find a local optimal solution. Thirdly, we obtain the joint BS- 
RS precoding design by solving the BS precoding and the 
RS precoding alternately, the convergence of which is guaran- 
teed. Simulation results show that the RS precoding scheme 
outperforms the BS precoding scheme in most cases and the 
joint precoding scheme outperforms the individual precoding 
scheme. Besides performance, practical implementation issues, 
including signaling overhead and design complexity, for the 
proposed precoding designs are also discussed and compared. 

The rest of the paper is organized as follows. In Section 
II, we present the system model. Different precoding designs 
are presented in Section III. In Section IV, we discuss the 
overhead and design complexity. Extensive simulation results 
are illustrated in Section V. Finally, we conclude the paper in 
Section VI. 

Notations: $\mathcal{E}(\cdot)$ denotes the expectation over the random
variables within the brackets. $\otimes$ denotes the Kronecker operator.
$\mathrm{Tr}(\mathbf{A})$, $\mathbf{A}^{-1}$, $\det(\mathbf{A})$ and $\mathrm{Rank}(\mathbf{A})$ stand for the trace,
inverse, determinant and rank of a matrix $\mathbf{A}$, respectively,
and $\mathrm{Diag}(\mathbf{a})$ denotes a diagonal matrix with $\mathbf{a}$ being its
diagonal entries. Superscripts $(\cdot)^T$, $(\cdot)^*$ and $(\cdot)^H$ denote the
transpose, conjugate and conjugate transpose, respectively.
$\mathbf{0}_{N\times M}$ denotes the $N \times M$ zero matrix. $\mathbf{I}_N$ denotes the $N \times N$
identity matrix and $\mathbf{I}_{N\times M} = [\mathbf{I}_M, \mathbf{0}_{(N-M)\times M}^T]^T$ if $N > M$.
$\|\mathbf{x}\|_2^2$ denotes the squared Euclidean norm of a complex vector
$\mathbf{x}$ and $\|\mathbf{X}\|_F^2$ denotes the squared Frobenius norm of a complex matrix
$\mathbf{X}$. $|z|$ denotes the modulus of a complex number $z$, and $\Re(z)$ and
$\Im(z)$ denote its real and imaginary parts, respectively. $\mathbb{C}^{x\times y}$
denotes the space of $x \times y$ matrices with complex entries. The
distribution of a circularly symmetric complex Gaussian vector
with mean vector $\mathbf{x}$ and covariance matrix $\mathbf{\Sigma}$ is denoted by
$\mathcal{CN}(\mathbf{x}, \mathbf{\Sigma})$.

II. System Model 

Consider a multiuser two-way relay system where an $N$-antenna
BS conducts bi-directional communication with $K$
single-antenna MSs under the assistance of an $M$-antenna
RS. For effective multiuser transmission, we let $N \ge K$ and
$M \ge K$. Moreover, we assume that all the MSs are cell-edge
users. Thus, due to impairments such as multipath fading,
shadowing and path loss of wireless channels, the direct-path
link between the BS and each MS is ignored. It is also assumed
that the RS operates in half-duplex mode. That is, it cannot
transmit and receive simultaneously.

Fig. 1. Illustration of a cellular MU-TWRS.

The bi-directional (i.e., uplink and downlink) communications
take place in two time slots as shown in Fig. 1. In the
first time slot, also referred to as the multiple-access (MAC) phase,
both the BS and the MSs simultaneously transmit their signals to
the RS. The received $M \times 1$ signal vector at the RS can be
written as

$$\mathbf{y}_R = \mathbf{H}_1\mathbf{x}_B + \sum_{k=1}^{K}\mathbf{h}_{2k}s_k + \mathbf{n}_R,$$

where $\mathbf{x}_B \in \mathbb{C}^{N\times 1}$ represents the transmit signal vector from
the BS and $s_k$ denotes the transmit signal from MS $k$. We
assume that the transmission power at MS $k$ is $P_k$, i.e.,
$\mathcal{E}(s_k s_k^*) = P_k$. $\mathbf{H}_1 \in \mathbb{C}^{M\times N}$ is the MIMO channel matrix
from the BS to the RS, $\mathbf{h}_{2k} \in \mathbb{C}^{M\times 1}$ is the channel vector
from MS $k$ to the RS, and $\mathbf{n}_R$ denotes the additive noise
vector at the RS following $\mathcal{CN}(\mathbf{0}, \sigma_R^2\mathbf{I}_M)$. Here $\mathbf{x}_B$ can be
further expressed as

$$\mathbf{x}_B = \mathbf{B}\mathbf{s}_B,$$

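where $\mathbf{s}_B \in \mathbb{C}^{K\times 1}$ with $\mathcal{E}(\mathbf{s}_B\mathbf{s}_B^H) = \mathbf{I}_K$ is the modulated
signal vector from the BS, and $\mathbf{B} = [\mathbf{b}_1, \mathbf{b}_2, \cdots, \mathbf{b}_K] \in \mathbb{C}^{N\times K}$
denotes the transmit precoding matrix at the BS. Furthermore,
the maximum transmission power at the BS is assumed to be
$P_B$, i.e.,

$$\mathrm{Tr}(\mathbf{B}\mathbf{B}^H) \le P_B. \tag{1}$$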

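Upon receiving the superimposed signal $\mathbf{y}_R$, the RS performs
linear processing by multiplying it with a precoding
matrix $\mathbf{F} \in \mathbb{C}^{M\times M}$ and then forwards it in the second time
slot, also referred to as the broadcast (BC) phase. Therefore, the
$M \times 1$ transmit signal vector from the RS is given by

$$\mathbf{x}_R = \mathbf{F}\mathbf{y}_R = \mathbf{F}\mathbf{H}_1\mathbf{x}_B + \sum_{k=1}^{K}\mathbf{F}\mathbf{h}_{2k}s_k + \mathbf{F}\mathbf{n}_R.$$

The maximum transmission power at the RS is given by $P_R$,
which yields

$$\mathrm{Tr}\left\{\mathbf{F}\left(\mathbf{H}_1\mathbf{B}\mathbf{B}^H\mathbf{H}_1^H + \mathbf{H}_2\mathbf{P}\mathbf{P}^H\mathbf{H}_2^H + \sigma_R^2\mathbf{I}_M\right)\mathbf{F}^H\right\} \le P_R, \tag{2}$$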

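where we define $\mathbf{P} = \mathrm{Diag}(\sqrt{P_1}, \sqrt{P_2}, \cdots, \sqrt{P_K})$ and $\mathbf{H}_2 =
[\mathbf{h}_{21}, \mathbf{h}_{22}, \cdots, \mathbf{h}_{2K}]$. Then the received signals at the BS and
MS $k$ after the BC phase can be written as

$$\begin{aligned}
\mathbf{y}_B &= \sum_{k=1}^{K}\mathbf{G}_1\mathbf{F}\mathbf{h}_{2k}s_k + \mathbf{G}_1\mathbf{F}\mathbf{H}_1\mathbf{B}\mathbf{s}_B + \mathbf{G}_1\mathbf{F}\mathbf{n}_R + \mathbf{n}_B \\
&= \mathbf{G}_1\mathbf{F}\mathbf{H}_2\mathbf{s}_M + \mathbf{G}_1\mathbf{F}\mathbf{H}_1\mathbf{B}\mathbf{s}_B + \mathbf{G}_1\mathbf{F}\mathbf{n}_R + \mathbf{n}_B,
\end{aligned} \tag{3}$$

$$y_k = \sum_{i=1}^{K}\mathbf{g}_{2k}^T\mathbf{F}\mathbf{H}_1\mathbf{b}_i s_{Bi} + \sum_{i=1}^{K}\mathbf{g}_{2k}^T\mathbf{F}\mathbf{h}_{2i}s_i + \mathbf{g}_{2k}^T\mathbf{F}\mathbf{n}_R + n_k. \tag{4}$$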



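Here, $\mathbf{s}_M = [s_1, s_2, \cdots, s_K]^T$, $s_{Bi}$ denotes the $i$-th entry in
$\mathbf{s}_B$, $\mathbf{G}_1 \in \mathbb{C}^{N\times M}$ and $\mathbf{g}_{2k} \in \mathbb{C}^{M\times 1}$ are the channel matrix
and vector from the RS to the BS and MS $k$, respectively, and
$\mathbf{n}_B$ and $n_k$ denote the additive noise at the BS and MS $k$,
respectively, with $\mathbf{n}_B \sim \mathcal{CN}(\mathbf{0}, \sigma_B^2\mathbf{I}_N)$ and $n_k \sim \mathcal{CN}(0, \sigma_k^2)$.
Note that both the BS and MS $k$ know their own transmit signals
$\mathbf{s}_B$ and $s_k$, respectively. Therefore, the back-propagated
self-interference terms containing $\mathbf{s}_B$ and $s_k$ can be subtracted from (3) and
(4), respectively. The equivalent received signals at the BS and
MS $k$ are then given, respectively, by

$$\hat{\mathbf{y}}_B = \mathbf{G}_1\mathbf{F}\mathbf{H}_2\mathbf{s}_M + \mathbf{G}_1\mathbf{F}\mathbf{n}_R + \mathbf{n}_B, \tag{5}$$

$$\hat{y}_k = \mathbf{g}_{2k}^T\mathbf{F}\mathbf{H}_1\mathbf{b}_k s_{Bk} + \sum_{i\ne k}\mathbf{g}_{2k}^T\mathbf{F}\mathbf{H}_1\mathbf{b}_i s_{Bi} + \sum_{i\ne k}\mathbf{g}_{2k}^T\mathbf{F}\mathbf{h}_{2i}s_i + \mathbf{g}_{2k}^T\mathbf{F}\mathbf{n}_R + n_k. \tag{6}$$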

From (6), we find that the received downlink signal at
each MS not only consists of the CCI from the downlink 
transmission (i.e., the second term), but also the CCI from 
the uplink transmission (i.e., the third term). The downlink 
performance of each MS can be measured by SINR given by 



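$$\mathrm{SINR}_k = \frac{|\mathbf{g}_{2k}^T\mathbf{F}\mathbf{H}_1\mathbf{b}_k|^2}{\sum_{i\ne k}\left(|\mathbf{g}_{2k}^T\mathbf{F}\mathbf{H}_1\mathbf{b}_i|^2 + P_i|\mathbf{g}_{2k}^T\mathbf{F}\mathbf{h}_{2i}|^2\right) + \sigma_R^2\|\mathbf{g}_{2k}^T\mathbf{F}\|_2^2 + \sigma_k^2}, \quad k = 1, 2, \cdots, K. \tag{7}$$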
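As a quick numerical companion to (7), the sketch below evaluates the downlink SINR of every MS with NumPy. The function name `downlink_sinr`, the argument layout (channels $\mathbf{g}_{2k}$ and $\mathbf{h}_{2k}$ stacked as columns of `G2` and `H2`) and the use of a plain power vector `P` are illustrative assumptions rather than notation from the paper.

```python
import numpy as np

def downlink_sinr(F, H1, H2, B, G2, P, sigma_R2, sigma_k2):
    """Downlink SINR of every MS following (7) (hypothetical helper).

    F: (M, M) relay precoder, H1: (M, N) BS-to-RS channel,
    H2: (M, K) with columns h_{2k}, B: (N, K) BS precoder,
    G2: (M, K) with columns g_{2k}, P: (K,) MS transmit powers,
    sigma_R2: relay noise variance, sigma_k2: (K,) MS noise variances.
    """
    K = B.shape[1]
    D = G2.T @ F @ H1 @ B      # D[k, i] = g_{2k}^T F H1 b_i
    U = G2.T @ F @ H2          # U[k, i] = g_{2k}^T F h_{2i}
    # Relay noise forwarded to MS k: sigma_R^2 * ||g_{2k}^T F||^2
    relay_noise = sigma_R2 * np.sum(np.abs(G2.T @ F) ** 2, axis=1)
    sinr = np.empty(K)
    for k in range(K):
        others = np.arange(K) != k
        # CCI from the other downlink streams and from the other MSs' uplink signals
        interference = np.sum(np.abs(D[k, others]) ** 2
                              + P[others] * np.abs(U[k, others]) ** 2)
        sinr[k] = np.abs(D[k, k]) ** 2 / (interference + relay_noise[k] + sigma_k2[k])
    return sinr
```

The self-interference term (the MS's own uplink signal, $i = k$ in the second sum of (4)) is deliberately excluded, matching the cancellation step that leads from (4) to (6).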

As for the uplink transmission in (5), it can be viewed as a
MIMO multiple-access channel. Depending on the performance
requirements, various metrics can be used to evaluate
its performance. Our first objective is to minimize the Total-MSE
of all the MSs, assuming a linear minimum mean-square-error
(MMSE) receiver at the BS. Using Total-MSE for
precoding design has been widely studied in multiuser systems
ED, HD, ED-CDQ. By minimizing the MSE

$$e = \mathcal{E}_{\mathbf{s}_M}\left(\|\mathbf{W}\hat{\mathbf{y}}_B - \mathbf{s}_M\|_2^2\right) \tag{8}$$



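with respect to the decoding matrix $\mathbf{W}$, the minimum Total-MSE
is given by OH

$$\min_{\mathbf{W}} e = \mathrm{Tr}\left(\mathbf{E}^{-1}\right), \tag{9}$$

where $\mathbf{E} = \mathbf{I}_K + \mathbf{P}^H\mathbf{H}_2^H\mathbf{F}^H\mathbf{G}_1^H\left(\sigma_R^2\mathbf{G}_1\mathbf{F}\mathbf{F}^H\mathbf{G}_1^H + \sigma_B^2\mathbf{I}_N\right)^{-1}\mathbf{G}_1\mathbf{F}\mathbf{H}_2\mathbf{P}$ and the optimal $\mathbf{W}$ in (8) is

$$\mathbf{W} = \mathbf{P}^H\mathbf{H}_2^H\mathbf{F}^H\mathbf{G}_1^H\left(\mathbf{G}_1\mathbf{F}\mathbf{H}_2\mathbf{P}\mathbf{P}^H\mathbf{H}_2^H\mathbf{F}^H\mathbf{G}_1^H + \sigma_R^2\mathbf{G}_1\mathbf{F}\mathbf{F}^H\mathbf{G}_1^H + \sigma_B^2\mathbf{I}_N\right)^{-1}. \tag{10}$$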



Our second objective aims to maximize the sum rate of 
the uplink transmission. By applying successive interference 
cancelation (SIC) and linear MMSE filter at the BS, the sum 
rate at the BS is given by 



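$$r = 0.5\log_2\det\left(\mathbf{I}_K + \mathbf{P}^H\mathbf{H}_2^H\mathbf{F}^H\mathbf{G}_1^H\left(\sigma_R^2\mathbf{G}_1\mathbf{F}\mathbf{F}^H\mathbf{G}_1^H + \sigma_B^2\mathbf{I}_N\right)^{-1}\mathbf{G}_1\mathbf{F}\mathbf{H}_2\mathbf{P}\right), \tag{11}$$

where the factor 0.5 is due to the fact that the MSs use
two time slots to complete the uplink transmission. Note that
(11) can be re-expressed as $r = 0.5\log_2\det(\mathbf{E})$ with $\mathbf{E}$
defined in (9). We will see that the precoding designs proposed
for Total-MSE minimization can be extended to sum rate
maximization.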
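To make the two uplink metrics concrete, the following NumPy sketch evaluates the matrix $\mathbf{E}$ of (9), the Total-MSE $\mathrm{Tr}(\mathbf{E}^{-1})$ and the sum rate $0.5\log_2\det(\mathbf{E})$ of (11). The helper name `uplink_metrics` and the convention of passing the MS powers as a vector `P` (so that $\mathbf{P} = \mathrm{Diag}(\sqrt{P_k})$ is built internally) are assumptions made for illustration.

```python
import numpy as np

def uplink_metrics(F, H2, G1, P, sigma_R2, sigma_B2):
    """Return (E, Total-MSE, sum rate) for the uplink, following (9) and (11).

    F: (M, M), H2: (M, K), G1: (N, M), P: (K,) MS transmit powers.
    """
    N = G1.shape[0]
    K = H2.shape[1]
    GF = G1 @ F
    Heff = GF @ H2 @ np.diag(np.sqrt(P))            # effective uplink channel G1 F H2 P
    # Covariance of the forwarded relay noise plus the BS noise
    Rn = sigma_R2 * GF @ GF.conj().T + sigma_B2 * np.eye(N)
    E = np.eye(K) + Heff.conj().T @ np.linalg.solve(Rn, Heff)
    total_mse = np.trace(np.linalg.inv(E)).real     # Tr(E^{-1}), eq. (9)
    logdet = np.linalg.slogdet(E)[1]                # natural log of det(E)
    rate = 0.5 * logdet / np.log(2.0)               # 0.5 * log2 det(E), eq. (11)
    return E, total_mse, rate
```

Because both metrics are functions of the same matrix $\mathbf{E}$, any precoder change that enlarges $\mathbf{E}$ in the positive semidefinite order lowers the Total-MSE and raises the sum rate at once.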

III. Linear Precoding Designs 

From Section II, it is seen that the downlink performance 
of each MS depends on both the BS precoder B and the 
RS precoder F. While for the uplink transmission, it is only 
related to the RS precoder F, thus less design freedom can 
be exploited compared with the downlink. In theory, the BS 
precoder B and the relay precoder F should be jointly de- 
signed such that the downlink and uplink performance can be 
optimized simultaneously. However, there is no single figure 
of merit to measure the overall performance of the multiuser 
bidirectional transmission. In this paper, we choose to ensure 
the downlink quality-of-service (QoS) for each individual MS
while at the uplink minimizing the Total-MSE or maximizing 
the sum rate of all the users. This is because in practice the 
downlink data traffic usually is more dominant than the uplink 
traffic. As such, the optimization problem is formulated as 



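$$\begin{aligned}
\min_{\mathbf{B},\mathbf{F}}\quad & \mathrm{Tr}\left(\mathbf{E}^{-1}\right)\ \text{or}\ -r \\
\text{s.t.}\quad & \mathrm{SINR}_k \ge \lambda_k,\quad k = 1, 2, \cdots, K \\
& \mathrm{Tr}(\mathbf{B}\mathbf{B}^H) \le P_B \\
& \mathrm{Tr}\left\{\mathbf{F}\left(\mathbf{H}_1\mathbf{B}\mathbf{B}^H\mathbf{H}_1^H + \mathbf{H}_2\mathbf{P}\mathbf{P}^H\mathbf{H}_2^H + \sigma_R^2\mathbf{I}_M\right)\mathbf{F}^H\right\} \le P_R
\end{aligned} \tag{12}$$

where $\lambda_k$ is a preset SINR threshold for MS $k$.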

Since linear precoding can be conducted at the BS, RS 
or both, three associated precoding designs are considered 
respectively in the following three subsections. Note that 
for each design, the system needs different computational 
complexity and signaling overhead, such that they are suitable 
to different scenarios. 

A. BS precoding 

In this subsection, we assume that precoding is only employed
at the BS, while the RS precoder is given as $\mathbf{F} = \alpha\bar{\mathbf{F}}$,
where $\bar{\mathbf{F}}$ is an arbitrary fixed precoder applied at the RS, and
$\alpha$ is a non-negative scalar used to scale the received signals
at the RS to satisfy the relay power constraint. Note that besides
maintaining the downlink SINR, a properly designed $\mathbf{B}$ can
reduce the RS power consumed by the signal $\mathbf{s}_B$ from the
BS. The uplink transmission can then share more power at the
RS, which is helpful for improving its performance.
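The optimization problem can be formulated as:

$$\begin{aligned}
\min_{\mathbf{B},\alpha}\quad & f_1(\alpha)\ \text{or}\ -f_2(\alpha) \\
\text{s.t.}\quad & \rho_k \ge \lambda_k,\ \forall k \\
& \mathrm{Tr}(\mathbf{B}\mathbf{B}^H) \le P_B \\
& \mathrm{Tr}\left\{\alpha^2\bar{\mathbf{F}}\left(\mathbf{H}_1\mathbf{B}\mathbf{B}^H\mathbf{H}_1^H + \mathbf{H}_2\mathbf{P}\mathbf{P}^H\mathbf{H}_2^H + \sigma_R^2\mathbf{I}_M\right)\bar{\mathbf{F}}^H\right\} \le P_R
\end{aligned} \tag{13}$$

where $f_1(\alpha) = \mathrm{Tr}\left(\mathbf{E}(\alpha)^{-1}\right)$ and $f_2(\alpha) = \log_2\det\left(\mathbf{E}(\alpha)\right)$
with $\mathbf{E}(\alpha) = \mathbf{I}_K + \alpha^2\mathbf{P}^H\mathbf{H}_2^H\bar{\mathbf{F}}^H\mathbf{G}_1^H\left(\sigma_R^2\alpha^2\mathbf{G}_1\bar{\mathbf{F}}\bar{\mathbf{F}}^H\mathbf{G}_1^H + \sigma_B^2\mathbf{I}_N\right)^{-1}\mathbf{G}_1\bar{\mathbf{F}}\mathbf{H}_2\mathbf{P}$ and

$$\rho_k = \frac{\alpha^2|\mathbf{g}_{2k}^T\bar{\mathbf{F}}\mathbf{H}_1\mathbf{b}_k|^2}{\xi_k + \alpha^2\sigma_R^2\|\mathbf{g}_{2k}^T\bar{\mathbf{F}}\|_2^2 + \sigma_k^2},$$

where $\xi_k = \sum_{i\ne k}\left(\alpha^2|\mathbf{g}_{2k}^T\bar{\mathbf{F}}\mathbf{H}_1\mathbf{b}_i|^2 + \alpha^2 P_i|\mathbf{g}_{2k}^T\bar{\mathbf{F}}\mathbf{h}_{2i}|^2\right)$. To
solve (13), we first give the following lemma, the
proof of which is given in Appendix A.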

Lemma 1: $f_1(\alpha)$ and $-f_2(\alpha)$ are monotonically decreasing
functions with respect to $\alpha$.
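The monotonicity stated in Lemma 1 can be checked numerically. The sketch below draws one random channel realization (an arbitrary test setup, not the paper's simulation configuration), evaluates $f_1(\alpha)$ and $f_2(\alpha)$ on a grid of $\alpha$ values, and stores the results so that the trend can be inspected. The helper name `f1_f2` and all parameter values are hypothetical.

```python
import numpy as np

def f1_f2(alpha, F_bar, H2, G1, P, sigma_R2, sigma_B2):
    """f_1(alpha) = Tr(E(alpha)^{-1}) and f_2(alpha) = log2 det E(alpha)."""
    N, K = G1.shape[0], H2.shape[1]
    GF = G1 @ F_bar
    Heff = GF @ H2 @ np.diag(np.sqrt(P))
    Rn = sigma_R2 * alpha**2 * GF @ GF.conj().T + sigma_B2 * np.eye(N)
    E = np.eye(K) + alpha**2 * Heff.conj().T @ np.linalg.solve(Rn, Heff)
    f1 = np.trace(np.linalg.inv(E)).real
    f2 = np.linalg.slogdet(E)[1] / np.log(2.0)
    return f1, f2

rng = np.random.default_rng(0)

def crandn(*shape):
    """Unit-variance circularly symmetric complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

N, M, K = 3, 3, 2                       # illustrative sizes only
F_bar, H2, G1 = crandn(M, M), crandn(M, K), crandn(N, M)
P = np.ones(K)
alphas = np.linspace(0.1, 5.0, 50)
vals = np.array([f1_f2(a, F_bar, H2, G1, P, 1.0, 1.0) for a in alphas])
f1_vals, f2_vals = vals[:, 0], vals[:, 1]

# f_1 should decrease and f_2 should increase as alpha grows
f1_decreasing = bool(np.all(np.diff(f1_vals) < 0))
f2_increasing = bool(np.all(np.diff(f2_vals) > 0))
```

The intuition is that $\mathbf{E}(\alpha)$ can be rewritten with the BS noise term scaled as $\sigma_B^2/\alpha^2$, so a larger relay gain only shrinks the effective noise seen by the uplink.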

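Based on Lemma 1, it is easy to see that minimizing $f_1(\alpha)$
or $-f_2(\alpha)$ in (13) is equivalent to maximizing the scalar $\alpha$.
By defining $\bar{\mathbf{B}} = \alpha\mathbf{B}$, problem (13) can be re-expressed as:

$$\begin{aligned}
\max_{\bar{\mathbf{B}},\alpha}\quad & \alpha \\
\text{s.t.}\quad & \mathrm{Tr}(\bar{\mathbf{B}}\bar{\mathbf{B}}^H) \le \alpha^2 P_B \\
& \mathrm{Tr}(\bar{\mathbf{F}}\mathbf{H}_1\bar{\mathbf{B}}\bar{\mathbf{B}}^H\mathbf{H}_1^H\bar{\mathbf{F}}^H) + \alpha^2\mathrm{Tr}\left(\bar{\mathbf{F}}(\mathbf{H}_2\mathbf{P}\mathbf{P}^H\mathbf{H}_2^H + \sigma_R^2\mathbf{I}_M)\bar{\mathbf{F}}^H\right) \le P_R \\
& \sum_{i=1}^{K}|\mathbf{g}_{2k}^T\bar{\mathbf{F}}\mathbf{H}_1\bar{\mathbf{b}}_i|^2 + \alpha^2\Big(\sum_{i\ne k}P_i|\mathbf{g}_{2k}^T\bar{\mathbf{F}}\mathbf{h}_{2i}|^2 + \sigma_R^2\|\mathbf{g}_{2k}^T\bar{\mathbf{F}}\|_2^2\Big) + \sigma_k^2 \le \Big(1 + \frac{1}{\lambda_k}\Big)|\mathbf{g}_{2k}^T\bar{\mathbf{F}}\mathbf{H}_1\bar{\mathbf{b}}_k|^2,\ \forall k
\end{aligned} \tag{14}$$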
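For readers who want the intermediate algebra, the SINR constraint in (14) follows from $\rho_k \ge \lambda_k$ in three steps. The derivation below is a sketch in which $\bar{\mathbf{F}}$ is the fixed relay precoder, $\bar{\mathbf{b}}_i = \alpha\mathbf{b}_i$ are the scaled BS precoder columns, and $d_k$ is a shorthand introduced here only for brevity.

```latex
\begin{align*}
\rho_k \ge \lambda_k
&\iff \alpha^2 |\mathbf{g}_{2k}^T \bar{\mathbf{F}} \mathbf{H}_1 \mathbf{b}_k|^2
   \ge \lambda_k \Big( \sum_{i \ne k} \alpha^2 |\mathbf{g}_{2k}^T \bar{\mathbf{F}} \mathbf{H}_1 \mathbf{b}_i|^2
   + \alpha^2 d_k + \sigma_k^2 \Big) \\
&\iff \frac{1}{\lambda_k} |\mathbf{g}_{2k}^T \bar{\mathbf{F}} \mathbf{H}_1 \bar{\mathbf{b}}_k|^2
   \ge \sum_{i \ne k} |\mathbf{g}_{2k}^T \bar{\mathbf{F}} \mathbf{H}_1 \bar{\mathbf{b}}_i|^2
   + \alpha^2 d_k + \sigma_k^2 \\
&\iff \Big(1 + \frac{1}{\lambda_k}\Big) |\mathbf{g}_{2k}^T \bar{\mathbf{F}} \mathbf{H}_1 \bar{\mathbf{b}}_k|^2
   \ge \sum_{i = 1}^{K} |\mathbf{g}_{2k}^T \bar{\mathbf{F}} \mathbf{H}_1 \bar{\mathbf{b}}_i|^2
   + \alpha^2 d_k + \sigma_k^2 ,
\end{align*}
\text{with } d_k = \sum_{i \ne k} P_i |\mathbf{g}_{2k}^T \bar{\mathbf{F}} \mathbf{h}_{2i}|^2
   + \sigma_R^2 \|\mathbf{g}_{2k}^T \bar{\mathbf{F}}\|_2^2 .
```

The second line substitutes $\bar{\mathbf{b}}_i = \alpha\mathbf{b}_i$ and divides by $\lambda_k > 0$; the third adds $|\mathbf{g}_{2k}^T\bar{\mathbf{F}}\mathbf{H}_1\bar{\mathbf{b}}_k|^2$ to both sides so that the sum runs over all $i$.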

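Although (14) is still a non-convex problem, we can use
the observation made in [21] that any phase shift of $\bar{\mathbf{b}}_k$, i.e.,
$e^{j\theta}\bar{\mathbf{b}}_k$, does not affect the optimality of the primal problem.
Therefore, for any optimal solution, there always exists a
phase-shifted version of $\bar{\mathbf{b}}_k$ that makes the term $\mathbf{g}_{2k}^T\bar{\mathbf{F}}\mathbf{H}_1\bar{\mathbf{b}}_k$ real and
non-negative while not affecting the value of the objective function
and keeping the constraints satisfied. Thus, we can convert
problem (14) into the following equivalent form

$$\begin{aligned}
\max\quad & \alpha \\
\text{s.t.}\quad & \|\bar{\mathbf{B}}\|_F^2 \le \alpha^2 P_B \\
& \|\bar{\mathbf{F}}\mathbf{H}_1\bar{\mathbf{B}}\|_F^2 + \alpha^2\mathrm{Tr}\left(\bar{\mathbf{F}}(\mathbf{H}_2\mathbf{P}\mathbf{P}^H\mathbf{H}_2^H + \sigma_R^2\mathbf{I}_M)\bar{\mathbf{F}}^H\right) \le P_R \\
& \sum_{i=1}^{K}|\mathbf{g}_{2k}^T\bar{\mathbf{F}}\mathbf{H}_1\bar{\mathbf{b}}_i|^2 + \alpha^2\Big(\sum_{i\ne k}P_i|\mathbf{g}_{2k}^T\bar{\mathbf{F}}\mathbf{h}_{2i}|^2 + \sigma_R^2\|\mathbf{g}_{2k}^T\bar{\mathbf{F}}\|_2^2\Big) + \sigma_k^2 \le \Big(1 + \frac{1}{\lambda_k}\Big)\left(\mathbf{g}_{2k}^T\bar{\mathbf{F}}\mathbf{H}_1\bar{\mathbf{b}}_k\right)^2,\ \forall k \\
& \mathbf{g}_{2k}^T\bar{\mathbf{F}}\mathbf{H}_1\bar{\mathbf{b}}_k\ \text{real and}\ \ge 0,\ \forall k
\end{aligned} \tag{15}$$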



where the optimization variables are collected in $[\mathrm{vec}(\bar{\mathbf{B}})^T, \alpha]$. It is not hard to verify that (15) is a
standard second-order cone program (SOCP) [22], and the optimal
solution can be obtained by using an available software package
[23]. Then, dividing $\bar{\mathbf{B}}$ by $\alpha$, we finally get the optimal $\mathbf{B}$.
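To see why the reformulated problem is a second-order cone program, each constraint can be written as "Euclidean norm of an affine function of the variables $\le$ affine function of the variables". The block below is one possible way to write this out, assuming $\bar{\mathbf{F}}$ denotes the fixed relay precoder, $\bar{\mathbf{B}} = \alpha\mathbf{B}$ with columns $\bar{\mathbf{b}}_k$, and with the constants $c$ and $d_k$ introduced here as shorthand.

```latex
\begin{align*}
\|\mathrm{vec}(\bar{\mathbf{B}})\|_2 &\le \alpha \sqrt{P_B}, \\
\left\| \begin{bmatrix} \mathrm{vec}(\bar{\mathbf{F}} \mathbf{H}_1 \bar{\mathbf{B}}) \\ \alpha \sqrt{c} \end{bmatrix} \right\|_2
  &\le \sqrt{P_R}, \\
\left\| \begin{bmatrix} (\mathbf{g}_{2k}^T \bar{\mathbf{F}} \mathbf{H}_1 \bar{\mathbf{B}})^T \\ \alpha \sqrt{d_k} \\ \sigma_k \end{bmatrix} \right\|_2
  &\le \sqrt{1 + \frac{1}{\lambda_k}} \; \mathbf{g}_{2k}^T \bar{\mathbf{F}} \mathbf{H}_1 \bar{\mathbf{b}}_k , \quad \forall k,
\end{align*}
\text{where } c = \mathrm{Tr}\big( \bar{\mathbf{F}} ( \mathbf{H}_2 \mathbf{P} \mathbf{P}^H \mathbf{H}_2^H + \sigma_R^2 \mathbf{I}_M ) \bar{\mathbf{F}}^H \big), \quad
d_k = \sum_{i \ne k} P_i |\mathbf{g}_{2k}^T \bar{\mathbf{F}} \mathbf{h}_{2i}|^2 + \sigma_R^2 \|\mathbf{g}_{2k}^T \bar{\mathbf{F}}\|_2^2 .
```

Taking square roots is valid in the last line only because the right-hand side has been made real and non-negative by the phase rotation; without that step the constraint would not define a convex cone.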

B. RS precoding 

In this subsection, we consider the precoding design at the 
RS with the BS precoder fixed. In the following, we first 
consider the precoding design for Total-MSE minimization, 
then extend it to sum rate maximization. 

1) Total-MSE minimization: The RS precoding to minimize
Total-MSE can be formulated as: 



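$$\begin{aligned}
\min_{\mathbf{F}}\quad & \mathrm{Tr}\left(\mathbf{E}^{-1}\right) \\
\text{s.t.}\quad & \tau \le P_R \\
& \zeta_k \ge \lambda_k,\ \forall k
\end{aligned} \tag{16}$$

where $\mathbf{E}$ is defined in (9), $\tau = \mathrm{Tr}\{\mathbf{F}(\mathbf{H}_1\mathbf{B}\mathbf{B}^H\mathbf{H}_1^H + \mathbf{H}_2\mathbf{P}\mathbf{P}^H\mathbf{H}_2^H + \sigma_R^2\mathbf{I}_M)\mathbf{F}^H\}$ and

$$\zeta_k = \frac{|\mathbf{g}_{2k}^T\mathbf{F}\mathbf{H}_1\mathbf{b}_k|^2}{\sum_{i\ne k}\left(|\mathbf{g}_{2k}^T\mathbf{F}\mathbf{H}_1\mathbf{b}_i|^2 + P_i|\mathbf{g}_{2k}^T\mathbf{F}\mathbf{h}_{2i}|^2\right) + \sigma_R^2\|\mathbf{g}_{2k}^T\mathbf{F}\|_2^2 + \sigma_k^2}.$$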

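Note that the power constraint at the BS is irrelevant here since
$\mathbf{B}$ is fixed. It is not hard to verify that the objective function
and the SINR constraints in (16) are both non-convex. To make
(16) more tractable, we substitute the linear MMSE decoding
matrix $\mathbf{W}$ back into (16) and rewrite it as:

$$\begin{aligned}
\min_{\mathbf{F},\mathbf{W}}\quad & f(\mathbf{F},\mathbf{W}) \\
\text{s.t.}\quad & \tau \le P_R \\
& \zeta_k \ge \lambda_k,\ \forall k
\end{aligned} \tag{17}$$

where

$$\begin{aligned}
f(\mathbf{F},\mathbf{W}) = \mathrm{Tr}\big\{ & \mathbf{W}\mathbf{G}_1\mathbf{F}\mathbf{H}_2\mathbf{P}\mathbf{P}^H\mathbf{H}_2^H\mathbf{F}^H\mathbf{G}_1^H\mathbf{W}^H + \sigma_R^2\mathbf{W}\mathbf{G}_1\mathbf{F}\mathbf{F}^H\mathbf{G}_1^H\mathbf{W}^H + \sigma_B^2\mathbf{W}\mathbf{W}^H + \mathbf{I}_K \\
& - \mathbf{W}\mathbf{G}_1\mathbf{F}\mathbf{H}_2\mathbf{P} - \mathbf{P}^H\mathbf{H}_2^H\mathbf{F}^H\mathbf{G}_1^H\mathbf{W}^H \big\}.
\end{aligned} \tag{18}$$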

Note that (18) can also be computed from (8). Although the
two design matrices $\mathbf{W}$ and $\mathbf{F}$ are coupled together in (17), the
advantage of introducing $\mathbf{W}$ is that we can apply alternating
optimization to solve two decoupled subproblems iteratively
in what follows.

In the alternating optimization, the first step is to update the
BS decoding matrix $\mathbf{W}$ for a given $\mathbf{F}$. From (17), it is seen
that the constraints are independent of $\mathbf{W}$. Thus, the optimal
$\mathbf{W}$ can be readily obtained as in (10) by setting the gradient
of the objective function in (17) to zero.
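The $\mathbf{W}$-update can be sanity-checked numerically: for a fixed $\mathbf{F}$, the closed-form decoder (10) should attain $f(\mathbf{F},\mathbf{W}) = \mathrm{Tr}(\mathbf{E}^{-1})$, and any perturbation of $\mathbf{W}$ should increase the objective (18). The sketch below does this for a random test instance; the function names and the problem sizes are illustrative assumptions.

```python
import numpy as np

def mse_objective(W, F, H2, G1, P, sigma_R2, sigma_B2):
    """f(F, W) in (18), written with Heff = G1 F H2 P."""
    K = H2.shape[1]
    GF = G1 @ F
    WG = W @ GF
    WH = W @ GF @ H2 @ np.diag(np.sqrt(P))
    val = (np.trace(WH @ WH.conj().T) + sigma_R2 * np.trace(WG @ WG.conj().T)
           + sigma_B2 * np.trace(W @ W.conj().T) + K
           - np.trace(WH) - np.trace(WH.conj().T))
    return val.real

def mmse_decoder(F, H2, G1, P, sigma_R2, sigma_B2):
    """Closed-form linear MMSE decoder W in (10)."""
    N = G1.shape[0]
    GF = G1 @ F
    Heff = GF @ H2 @ np.diag(np.sqrt(P))
    R = Heff @ Heff.conj().T + sigma_R2 * GF @ GF.conj().T + sigma_B2 * np.eye(N)
    return Heff.conj().T @ np.linalg.inv(R)

rng = np.random.default_rng(1)

def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

N, M, K = 3, 3, 2                                  # illustrative sizes only
F, H2, G1, P = crandn(M, M), crandn(M, K), crandn(N, M), np.ones(K)
W_opt = mmse_decoder(F, H2, G1, P, 1.0, 1.0)
f_opt = mse_objective(W_opt, F, H2, G1, P, 1.0, 1.0)
f_pert = mse_objective(W_opt + 0.1 * crandn(K, N), F, H2, G1, P, 1.0, 1.0)

# Reference value Tr(E^{-1}) from (9)
GF = G1 @ F
Heff = GF @ H2 @ np.diag(np.sqrt(P))
Rn = GF @ GF.conj().T + np.eye(N)
E = np.eye(K) + Heff.conj().T @ np.linalg.solve(Rn, Heff)
tr_Einv = np.trace(np.linalg.inv(E)).real
```

Since (18) is a strictly convex quadratic in $\mathbf{W}$ whenever $\sigma_B^2 > 0$, the stationary point given by (10) is the unique minimizer, which is what makes this step of the alternating optimization exact.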

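Secondly, we need to optimize $\mathbf{F}$ with $\mathbf{W}$ fixed. This
problem is equivalently rewritten as:

$$\begin{aligned}
\min_{\mathbf{F}}\quad & \mathrm{Tr}\big\{\mathbf{G}_1^H\mathbf{W}^H\mathbf{W}\mathbf{G}_1\mathbf{F}\left(\mathbf{H}_2\mathbf{P}\mathbf{P}^H\mathbf{H}_2^H + \sigma_R^2\mathbf{I}_M\right)\mathbf{F}^H - \mathbf{F}\mathbf{H}_2\mathbf{P}\mathbf{W}\mathbf{G}_1 \\
& \quad - \mathbf{G}_1^H\mathbf{W}^H\mathbf{P}^H\mathbf{H}_2^H\mathbf{F}^H + \sigma_B^2\mathbf{W}\mathbf{W}^H + \mathbf{I}_K\big\} \\
\text{s.t.}\quad & \tau \le P_R \\
& \zeta_k \ge \lambda_k,\ \forall k
\end{aligned} \tag{19}$$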
where we have used the fact that $\mathrm{Tr}(\mathbf{A}\mathbf{B}) = \mathrm{Tr}(\mathbf{B}\mathbf{A})$ for
(18). Although the objective function in
(19) can be verified to be convex based on [24], the optimal $\mathbf{F}$ is still not easy
to obtain due to the non-convex SINR constraints.
To proceed, we need to recast (19) into a suitable form such
that efficient optimization tools can be applied. After certain
transformations, as detailed in Appendix B, problem (19) can
be rewritten in the following inhomogeneous quadratically
constrained quadratic program (QCQP) form [22]:

$$\min_{\mathbf{f}}\quad \mathbf{f}^H\mathbf{Q}_0\mathbf{f} - \mathbf{f}^H\mathbf{q}_0 - \mathbf{q}_0^H\mathbf{f} + q_0 \tag{20a}$$

$$\text{s.t.}\quad \mathbf{f}^H\mathbf{Q}_\tau\mathbf{f} \le P_R \tag{20b}$$

$$\mathbf{f}^H\mathbf{Q}_k\mathbf{f} \ge \lambda_k\sigma_k^2,\ \forall k \tag{20c}$$

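where $\mathbf{f}$, $\mathbf{Q}_0$, $\mathbf{Q}_\tau$ and $\mathbf{Q}_k$ are defined in (34), (36) and
(38) in Appendix B, respectively. By checking the positive
semidefiniteness of $\mathbf{Q}_0$ and the positive definiteness of $\mathbf{Q}_\tau$,
we can verify that both the objective function (20a) and
the RS power constraint (20b) are convex. However, the
constraint (20c) is not convex because $\mathbf{Q}_k$ defined in (38)
is not necessarily negative semidefinite. Hence, optimization
problem (20) is non-convex. To solve (20), we rewrite it
in a standard (homogeneous) QCQP form as follows:

$$\begin{aligned}
\min_{\mathbf{x}_F}\quad & \mathbf{x}_F^H\bar{\mathbf{Q}}_0\mathbf{x}_F \\
\text{s.t.}\quad & |t|^2 = 1 \\
& \mathbf{x}_F^H\bar{\mathbf{Q}}_\tau\mathbf{x}_F \le P_R \\
& \mathbf{x}_F^H\bar{\mathbf{Q}}_k\mathbf{x}_F \le 0,\ \forall k
\end{aligned} \tag{21}$$

where $\mathbf{x}_F = [t, \mathbf{f}^T]^T$,

$$\bar{\mathbf{Q}}_0 = \begin{bmatrix} q_0 & -\mathbf{q}_0^H \\ -\mathbf{q}_0 & \mathbf{Q}_0 \end{bmatrix},\quad
\bar{\mathbf{Q}}_\tau = \begin{bmatrix} 0 & \mathbf{0}_{1\times M^2} \\ \mathbf{0}_{M^2\times 1} & \mathbf{Q}_\tau \end{bmatrix}\quad \text{and}\quad
\bar{\mathbf{Q}}_k = \begin{bmatrix} \lambda_k\sigma_k^2 & \mathbf{0}_{1\times M^2} \\ \mathbf{0}_{M^2\times 1} & -\mathbf{Q}_k \end{bmatrix}.$$

Note that (19) and (21) are equivalent to each other. If we get an optimal
solution of (21), we can always obtain an optimal solution of
(20) by selecting the appropriate entries from $\mathbf{x}_F/t$, no matter whether $t$
is real or complex. By a close inspection of (21), we find
that (21) can be transformed into the following semidefinite
programming (SDP) form [22]: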



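$$\begin{aligned}
\min_{\mathbf{X}_F \succeq 0}\quad & \mathrm{Tr}(\bar{\mathbf{Q}}_0\mathbf{X}_F) \\
\text{s.t.}\quad & \mathrm{Rank}(\mathbf{X}_F) = 1 \\
& \mathrm{Tr}(\bar{\mathbf{Q}}_t\mathbf{X}_F) = 1 \\
& \mathrm{Tr}(\bar{\mathbf{Q}}_\tau\mathbf{X}_F) \le P_R \\
& \mathrm{Tr}(\bar{\mathbf{Q}}_k\mathbf{X}_F) \le 0,\ \forall k
\end{aligned} \tag{22}$$

where $\bar{\mathbf{Q}}_t = \begin{bmatrix} 1 & \mathbf{0}_{1\times M^2} \\ \mathbf{0}_{M^2\times 1} & \mathbf{0}_{M^2\times M^2} \end{bmatrix}$. Due to the rank-one
constraint, it is not easy to obtain an optimal solution of (22).
We therefore resort to relaxing it by deleting the rank-one
constraint, namely,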



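$$\begin{aligned}
\min_{\mathbf{X}_F \succeq 0}\quad & \mathrm{Tr}(\bar{\mathbf{Q}}_0\mathbf{X}_F) \\
\text{s.t.}\quad & \mathrm{Tr}(\bar{\mathbf{Q}}_t\mathbf{X}_F) = 1 \\
& \mathrm{Tr}(\bar{\mathbf{Q}}_\tau\mathbf{X}_F) \le P_R \\
& \mathrm{Tr}(\bar{\mathbf{Q}}_k\mathbf{X}_F) \le 0,\ \forall k
\end{aligned} \tag{23}$$

Note that (23) is a standard SDP problem, thus its optimal
solution can be easily obtained by using an available software
package [23]. If the optimal solution of (23) is rank-one, the
optimal RS precoder can be obtained by using eigenvalue
decomposition. Otherwise, certain techniques are required to
find the optimal RS precoder.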
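The lifting step and the rank-one recovery can be illustrated in a few lines of NumPy. The sketch below builds $\mathbf{X}_F = \mathbf{x}_F\mathbf{x}_F^H$ from $\mathbf{x}_F = [t, \mathbf{f}^T]^T$, checks that the lifted linear objective equals the inhomogeneous quadratic objective when $t = 1$, and recovers $\mathbf{f}$ from a rank-one $\mathbf{X}_F$ by eigenvalue decomposition. The helper names, the tolerance and the random test data are assumptions for illustration; no SDP solver is involved.

```python
import numpy as np

def lift(f, t=1.0):
    """Build X_F = x_F x_F^H with x_F = [t, f^T]^T."""
    x = np.concatenate(([t], f))
    return np.outer(x, x.conj())

def recover_f(X, tol=1e-8):
    """Recover f from a numerically rank-one X_F; return None otherwise."""
    w, V = np.linalg.eigh(X)              # eigenvalues in ascending order
    if w[-1] <= 0 or np.sum(w > tol * w[-1]) != 1:
        return None
    x = np.sqrt(w[-1]) * V[:, -1]         # principal component, x x^H = X
    return x[1:] / x[0]                   # divide by t to remove the phase ambiguity

rng = np.random.default_rng(2)
n = 4                                     # plays the role of M^2
f = rng.standard_normal(n) + 1j * rng.standard_normal(n)
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Q0 = A @ A.conj().T                       # random positive semidefinite Q_0
q0_vec = rng.standard_normal(n) + 1j * rng.standard_normal(n)
q0_scalar = 1.5

# Block matrix playing the role of the lifted objective matrix
Q0_bar = np.block([[np.array([[q0_scalar]]), -q0_vec.conj()[None, :]],
                   [-q0_vec[:, None], Q0]])

quad_obj = (f.conj() @ Q0 @ f - f.conj() @ q0_vec - q0_vec.conj() @ f + q0_scalar).real
lifted_obj = np.trace(Q0_bar @ lift(f)).real
f_rec = recover_f(lift(f, t=np.exp(0.7j)))   # recovers f / t for a unit-modulus t
```

Dividing by the first entry is what makes the recovery insensitive to the arbitrary phase returned by the eigen-solver, in line with the remark that the solution is read off from $\mathbf{x}_F/t$.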

In what follows, we first consider a system with no more
than two MSs (i.e., $K \le 2$), for which an optimal solution of
(20) can be obtained in most cases. Then, we extend the results
to a more general system with $K > 2$, where the randomization
technique is applied to find a quasi-optimal solution.
a) $K \le 2$: We first give the following theorem.

Theorem 1: Suppose that the considered cellular MU-TWRS
has at most two MSs, i.e., $K \le 2$. Then an optimal rank-one
solution of the non-convex optimization problem (22) can be
derived in polynomial time from the relaxed SDP problem
(23) in the following cases: 1) problem (23) has an optimal
rank-one solution; 2) problem (23) has at least one inactive
constraint at the optimal solution; 3) problem (23) has an
optimal solution of rank higher than two with all the constraints
active.

Proof: Please refer to Appendix C. ■
From Theorem 1, we find that we cannot obtain an optimal
rank-one solution if the SDP relaxation problem (23) happens
to have an optimal solution of rank two with all the constraints
being active. However, our simulations show that this case
rarely occurs. Nonetheless, we provide a procedure for
producing a suboptimal rank-one solution in Appendix D for
that special case.

Now, the iterative RS precoding algorithm to minimize
Total-MSE for $K \le 2$ can be outlined as follows.

Algorithm 1 (RS precoding with $K \le 2$)

• Initialize $\mathbf{F}$

• Repeat

- Update the BS decoding matrix $\mathbf{W}$ using (10) for a fixed $\mathbf{F}$;

- Update the RS precoder $\mathbf{F}$ with $\mathbf{W}$ fixed as follows: If the obtained
$\mathbf{X}_F$ in (23) is rank-one, use eigenvalue decomposition to get $\mathbf{F}$.
Otherwise, use the procedures presented in Appendix C or D to
get $\mathbf{F}$;

• Until termination criterion is satisfied.

Lemma 2: Algorithm 1 is convergent and the limit point of
the iteration is a stationary point of (17).

Proof: Since for $K \le 2$ the optimal solution of (19)
can be obtained in most cases, as claimed in Theorem 1,
the solution in each iteration of Algorithm 1 can be viewed
as being optimal. Thus the Total-MSE at the BS is strictly
reduced after each iteration before convergence. On the other
hand, the objective function is lower-bounded (by zero at least).
Therefore, we conclude that Algorithm 1 is convergent. Let
$\{\bar{\mathbf{W}}, \bar{\mathbf{F}}\}$ denote the limit point of Algorithm 1. At the
limit point, the solution will not change if we continue the
iteration. Otherwise, the Total-MSE could be further decreased,
which contradicts the assumption of convergence. The optimality of the
solution in each iteration further means that $\bar{\mathbf{W}}$ and $\bar{\mathbf{F}}$ are
local minimizers of the respective subproblems. Hence, we have

$$\mathrm{Tr}\{\nabla_{\mathbf{W}} f(\bar{\mathbf{W}}; \bar{\mathbf{F}})^T(\mathbf{W} - \bar{\mathbf{W}})\} \ge 0,$$

$$\mathrm{Tr}\{\nabla_{\mathbf{F}} f(\bar{\mathbf{F}}; \bar{\mathbf{W}})^T(\mathbf{F} - \bar{\mathbf{F}})\} \ge 0.$$

Summing up the two inequalities, we get

$$\mathrm{Tr}\{\nabla_{\mathbf{X}} f(\bar{\mathbf{X}})^T(\mathbf{X} - \bar{\mathbf{X}})\} \ge 0, \tag{24}$$

where $\bar{\mathbf{X}} = [\bar{\mathbf{W}}, \bar{\mathbf{F}}]$. Condition (24) implies the stationarity of
$\bar{\mathbf{X}}$ in (17) (e.g., see Theorem 3 of [25]). ■
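b) $K > 2$: Now we consider a more general case with
$K > 2$. Since at least five constraints are contained in (23), it
is difficult to find an optimal rank-one solution if the optimal
solution of (23) has rank higher than one. Next we propose
to apply the randomization technique in [26] to find a
quasi-optimal rank-one solution of (20). We first transform (20) into
the following equivalent form:

$$\begin{aligned}
\min_{\tilde{\mathbf{F}},\mathbf{f}}\quad & \mathrm{Tr}(\mathbf{Q}_0\tilde{\mathbf{F}}) - \mathbf{f}^H\mathbf{q}_0 - \mathbf{q}_0^H\mathbf{f} + q_0 \\
\text{s.t.}\quad & \mathrm{Tr}(\mathbf{Q}_\tau\tilde{\mathbf{F}}) \le P_R \\
& \mathrm{Tr}(\mathbf{Q}_k\tilde{\mathbf{F}}) \ge \lambda_k\sigma_k^2,\ \forall k \\
& \tilde{\mathbf{F}} = \mathbf{f}\mathbf{f}^H
\end{aligned} \tag{25}$$

Relaxing the constraint $\tilde{\mathbf{F}} = \mathbf{f}\mathbf{f}^H$ to $\tilde{\mathbf{F}} \succeq \mathbf{f}\mathbf{f}^H$ and
applying the Schur complement theorem, we get the following
optimization problem:

$$\begin{aligned}
\min_{\tilde{\mathbf{F}},\mathbf{f}}\quad & \mathrm{Tr}(\mathbf{Q}_0\tilde{\mathbf{F}}) - \mathbf{f}^H\mathbf{q}_0 - \mathbf{q}_0^H\mathbf{f} + q_0 \\
\text{s.t.}\quad & \mathrm{Tr}(\mathbf{Q}_\tau\tilde{\mathbf{F}}) \le P_R \\
& \mathrm{Tr}(\mathbf{Q}_k\tilde{\mathbf{F}}) \ge \lambda_k\sigma_k^2,\ \forall k \\
& \begin{bmatrix} \tilde{\mathbf{F}} & \mathbf{f} \\ \mathbf{f}^H & 1 \end{bmatrix} \succeq 0
\end{aligned} \tag{26}$$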

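Note that (26) is convex, thus the obtained solution is optimal.
If we generate enough samples of a Gaussian variable $\mathbf{x}$ following
$\mathcal{CN}(\mathbf{f}, \tilde{\mathbf{F}} - \mathbf{f}\mathbf{f}^H)$, with $\tilde{\mathbf{F}}$ and $\mathbf{f}$ being an optimal solution
of (26), and choose the best candidate $\mathbf{x}$ from the samples as
a solution of (20), $\mathbf{x}$ will optimally solve (20) on average, i.e.,

$$\begin{aligned}
\min\quad & \mathcal{E}\left(\mathbf{f}^H\mathbf{Q}_0\mathbf{f} - \mathbf{f}^H\mathbf{q}_0 - \mathbf{q}_0^H\mathbf{f} + q_0\right) \\
\text{s.t.}\quad & \mathcal{E}\left(\mathbf{f}^H\mathbf{Q}_\tau\mathbf{f}\right) \le P_R \\
& \mathcal{E}\left(\mathbf{f}^H\mathbf{Q}_k\mathbf{f}\right) \ge \lambda_k\sigma_k^2,\ \forall k
\end{aligned} \tag{27}$$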
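The sampling step can be sketched as follows. Given a solution $(\tilde{\mathbf{F}}, \mathbf{f})$ of the relaxed problem (named `F_tilde` and `f` here), the code draws candidates from $\mathcal{CN}(\mathbf{f}, \tilde{\mathbf{F}} - \mathbf{f}\mathbf{f}^H)$, keeps those that pass a user-supplied feasibility test, and returns the one with the smallest objective value. The function name `randomize`, the callback interface and the sample count are assumptions for illustration; the paper's exact procedure (for example, any rescaling of infeasible samples) is not reproduced.

```python
import numpy as np

def randomize(F_tilde, f, objective, feasible, num_samples=200, seed=0):
    """Gaussian randomization sketch: sample x ~ CN(f, F_tilde - f f^H).

    objective(x) -> float and feasible(x) -> bool are supplied by the caller
    and would encode (20a) and the constraints (20b)-(20c), respectively.
    Returns (best_x, best_value), or (None, inf) if no sample is feasible.
    """
    rng = np.random.default_rng(seed)
    n = f.size
    C = F_tilde - np.outer(f, f.conj())
    C = (C + C.conj().T) / 2                       # enforce Hermitian symmetry
    w, V = np.linalg.eigh(C)
    root = V * np.sqrt(np.clip(w, 0.0, None))      # root @ root^H approximates C
    best, best_val = None, np.inf
    for _ in range(num_samples):
        z = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
        x = f + root @ z                           # one candidate solution
        if feasible(x):
            val = objective(x)
            if val < best_val:
                best, best_val = x, val
    return best, best_val
```

When the relaxation is tight, the covariance $\tilde{\mathbf{F}} - \mathbf{f}\mathbf{f}^H$ vanishes and every sample coincides with $\mathbf{f}$, so the procedure degrades gracefully to the rank-one case.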

Finally, the proposed iterative algorithm for $K > 2$ is outlined
as:

Algorithm 2 (RS precoding with $K > 2$)

• Initialize $\mathbf{F}$

• Repeat

- Update the BS decoding matrix $\mathbf{W}$ using (10) for a fixed $\mathbf{F}$;

- Update the RS precoding matrix $\mathbf{F}$ with $\mathbf{W}$ fixed using the
following steps: First, form an optimization problem as in (23). If
the obtained solution is rank-one, the optimal RS precoder is obtained
by applying eigenvalue decomposition. Otherwise, apply the
randomization procedures (25)-(26) to get a quasi-optimal solution;

• Until termination criterion is satisfied.

Note that although the $\mathbf{F}$ obtained from the second step in
Algorithm 2 may not be optimal, our simulation results show
that the $\mathbf{F}$ obtained by using randomization is always good
enough to make the iteration converge.



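2) Sum-rate maximization: Motivated by the relationship
between sum rate and weighted MMSE in the MIMO-BC system
recently found in [27], we next extend the proposed
RS precoding design for Total-MSE minimization to sum rate
maximization. The sum-rate maximization problem is restated
as:

$$\begin{aligned}
\max_{\mathbf{F}}\quad & \log_2\det\left(\mathbf{I}_K + \mathbf{P}^H\mathbf{H}_2^H\mathbf{F}^H\mathbf{G}_1^H\left(\sigma_R^2\mathbf{G}_1\mathbf{F}\mathbf{F}^H\mathbf{G}_1^H + \sigma_B^2\mathbf{I}_N\right)^{-1}\mathbf{G}_1\mathbf{F}\mathbf{H}_2\mathbf{P}\right) \\
\text{s.t.}\quad & \tau \le P_R \\
& \zeta_k \ge \lambda_k,\ \forall k
\end{aligned} \tag{28}$$

where the constraints are the same as in (16). It is not hard
to verify that (28) is non-convex. To solve (28), we introduce
the following lemma.

Lemma 3: If an $\mathbf{F}$ satisfies the Karush-Kuhn-Tucker (KKT)
conditions of (28), it will also satisfy the KKT conditions of
the following problem:

$$\begin{aligned}
\min_{\mathbf{F}}\quad & \mathrm{Tr}\left(\mathbf{A}\mathbf{E}^{-1}\right) \\
\text{s.t.}\quad & \tau \le P_R \\
& \zeta_k \ge \lambda_k,\ \forall k
\end{aligned} \tag{29}$$

where $\mathbf{E}$ is defined in (9), if the weight matrix $\mathbf{A}$ is set to

$$\mathbf{A} = \frac{1}{\ln 2}\left(\mathbf{I}_K + \mathbf{P}^H\mathbf{H}_2^H\mathbf{F}^H\mathbf{G}_1^H\left(\sigma_R^2\mathbf{G}_1\mathbf{F}\mathbf{F}^H\mathbf{G}_1^H + \sigma_B^2\mathbf{I}_N\right)^{-1}\mathbf{G}_1\mathbf{F}\mathbf{H}_2\mathbf{P}\right). \tag{30}$$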

Proof: The proof is similar to that for the MIMO BC precoding
design problem in [27], thus we omit it for brevity. ■
Lemma 3 implies that, using the weight matrix $\mathbf{A}$ in (30),
(28) shares the same stationary point with (29). Then alternating
optimization can be used to get the final solution of (28)
as in [27], which is presented as follows:

Algorithm 3 (RS precoding for maximizing sum rate)

• Initialize $\mathbf{F}$

• Repeat

- Update the BS decoder matrix $\mathbf{W}$ using (10) for fixed $\mathbf{F}$ and $\mathbf{A}$;

- Update the weight matrix $\mathbf{A}$ using (30) for fixed $\mathbf{F}$ and $\mathbf{W}$;

- Update the RS precoder matrix $\mathbf{F}$ as in Algorithm 1 or 2;

• Until termination criterion is satisfied.

According to the convergence analysis provided in [27], the
convergence of Algorithm 3 can be ensured.

C. Joint precoding 

The two precoding designs presented above
can be combined to realize a joint BS-RS precoding design
with better performance. In this case, if the RS has enough
capability to carry out the joint design, it can collect all the
required CSI and optimize $\mathbf{F}$ and $\mathbf{B}$ jointly. Then, besides $\mathbf{F}$,
the RS should also broadcast $\mathbf{B}$ to the BS and MSs. On the
other hand, the joint optimization can also be conducted at
the BS, in which case the RS helps to collect the CSI and transmits it to
the BS. Then, the BS needs to transmit $\mathbf{B}$ and $\mathbf{F}$ to the RS,
and the RS further broadcasts them to the MSs. In either case,
the joint precoding design requires more feedback overhead,
although it leads to better performance.
According to the algorithms proposed in Subsections A and
B, the joint precoding design is outlined as:

Algorithm 4 (Joint precoding scheme) 

• Initialize B 

• Repeat 

- Update the RS precoder F for a fixed BS precoder B by using
Algorithm 1 or 2 for Total-MSE minimization and Algorithm 3
for sum rate maximization;

- Update the BS precoder B for a fixed relay precoder F by using 
the SOCP optimization as in Subsection A; 

• Until termination criterion is satisfied. 

Lemma 4: The proposed joint precoding design algorithm
is convergent.

Proof: For convenience of presentation, we take Total-MSE
minimization as an example. The proof can be easily
extended to the case of sum rate maximization. Firstly, for
a fixed $\mathbf{F}$, updating $\mathbf{B}$ must decrease the Total-MSE at the BS
by increasing $\alpha$ in (13); otherwise, the BS precoder $\mathbf{B}$ should
not be changed. Thus, we have

$$e\left(\mathbf{B}(n+1), \mathbf{F}(n)\right) \le e\left(\mathbf{B}(n), \mathbf{F}(n)\right),$$

where $n$ denotes the iteration index. Then, we apply the
proposed RS precoding design to update $\mathbf{F}$ by initializing
$\mathbf{F} = \alpha\bar{\mathbf{F}}(n)$. Since the proposed iterative RS precoding
design algorithm decreases the Total-MSE after each iteration, we
have¹

$$e\left(\mathbf{B}(n+1), \mathbf{F}(n+1)\right) \le e\left(\mathbf{B}(n+1), \mathbf{F}(n)\right).$$

Therefore, we conclude that the joint precoding design algorithm
is convergent. ■

IV. Discussion on Signaling Overhead and Design Complexity

As mentioned previously, each precoding design has its
own merit. The choice of precoding scheme depends not only
on the processing capability of the BS and the
RS, but also on the design complexity and signaling overhead. In
this section, we provide a comprehensive comparison between
these designs. It is assumed that the channel characteristics
of each link change slowly enough that they can be perfectly
estimated by using pilot symbols or training sequences.
Besides, the channel state information and the precoders can
be exchanged accurately between the BS and the RS, and between the
RS and the MSs, through some lower-rate auxiliary channels.
For completeness, two transmission modes, i.e., time-division
duplex (TDD) mode and frequency-division duplex (FDD)
mode, are considered. The overall comparisons
are presented in Table I, where "Overhead-I" denotes the
overhead used to feed back the CSI and "Overhead-II" denotes
the overhead used to feed back the precoding information.
Moreover, we suppose that the BS and MSs can estimate their 
local CSI Gi and \i2k, Vfc, respectively. 

Since the BS precoding design is an SOCP problem, according
to [28], the design complexity can be approximated as

$$n_{BS} = (NK + 1)^2 (K + 2)^{0.5} (2NK + K^2 + 2K + 4)\log(1/\epsilon), \tag{31}$$

¹ In the case of solving (20) through randomization for $K \ge 3$, if we
cannot find a solution that decreases the objective value in (20), we can simply set
$\mathbf{F}(n+1) = \alpha\bar{\mathbf{F}}(n)$.
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(a) N = 2,M = 2,K = 2 
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(b) N = 3,M = 3,K = 3 

Fig. 2. Checking the optimality of the RS precoding design at P = 5 dB 
and L = 5. 



where e denotes the solution accuracy. For the RS precoding 
design, the design complexity mainly comes from solving the 
SDP problem and using the randomization technique. Thus, 
according to l29ll , it can be approximated as 



n RS = Irs (max(M 2 , K + 2) 4 Mlog(l/e) 



n rd ) 



(32) 



where n r d denotes the complexity of randomization and Irs 
denotes the iteration number required in Algorithm 1, 2 or 
3. Note that when K < 2, n r d is equal to (assuming that 
the complexity of getting rank-one solution from higher rank 
one can be omitted). Combining (I3TT) and (l32b leads to the 
joint precoding design complexity given in Table U where lj 
denotes the iteration number needed in Algorithm 4. 
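
To get a feel for how the expressions in (31) and (32) scale with the system size, they can be evaluated numerically. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, the logarithm is assumed to be natural, and the iteration counts in the usage example are simply the values reported in the simulation section (20 inner RS iterations for K = 2, 10 joint iterations).

```python
import math


def bs_complexity(N, K, eps):
    """Approximate complexity of the BS precoding design, following (31)."""
    return ((N * K + 1) ** 2 * math.sqrt(K + 2)
            * (2 * N * K + K ** 2 + 2 * K + 4) * math.log(1.0 / eps))


def rs_complexity(M, K, eps, rs_iterations, randomization_cost=0.0):
    """Approximate complexity of the iterative RS precoding design, following (32).

    randomization_cost plays the role of the randomization term; it is taken
    as 0 by default, which matches the case K <= 2 discussed in the text.
    """
    sdp_cost = max(M ** 2, K + 2) ** 4 * M * math.log(1.0 / eps)
    return rs_iterations * (sdp_cost + randomization_cost)


def joint_complexity(N, M, K, eps, rs_iterations, joint_iterations,
                     randomization_cost=0.0):
    """Joint design: joint_iterations rounds of one BS and one RS design."""
    return joint_iterations * (
        bs_complexity(N, K, eps)
        + rs_complexity(M, K, eps, rs_iterations, randomization_cost))


# Example: N = M = K = 2 with solution accuracy 1e-3 (illustrative values).
print(bs_complexity(2, 2, 1e-3))
print(rs_complexity(2, 2, 1e-3, rs_iterations=20))
print(joint_complexity(2, 2, 2, 1e-3, rs_iterations=20, joint_iterations=10))
```

For this small configuration the RS design is already the more expensive of the two, in line with the observation that the BS precoding design has lower design complexity.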

From Table I we find that the difference in signaling overhead
between the BS precoding and the RS precoding is not
significant if they are designed at the same station, and that it
depends on the antenna configuration of the system. In general,
the BS precoding design has lower design complexity than
the RS precoding design. For each precoding design, it
is more practical to perform it at the RS in order to reduce the
signaling overhead.

TABLE I
Signaling overhead and design complexity comparison
(arrows give the direction of each feedback link)

Design | TDD Overhead-I | TDD Overhead-II | FDD Overhead-I | FDD Overhead-II | Complexity
(1) BS precoding (design at BS) | RS → BS; RS → MSs | BS → RS; RS → MSs | MSs → RS; RS → BS; RS → MS k | BS → RS; RS → MSs | O(Π_BS)
(2) BS precoding (design at RS) | same as (1) | RS → BS, MSs | BS → RS; MSs → RS; RS → BS; RS → MS k | RS → BS, MSs | O(Π_BS)
(3) RS precoding (design at BS) | same as (1) | BS → RS; RS → MSs | same as (1) | BS → RS; RS → MSs | O(Π_RS)
(4) RS precoding (design at RS) | same as (1) | RS → BS, MSs | same as (2) | RS → BS, MSs | O(Π_RS)
(5) Joint precoding (design at BS) | same as (1) | BS → RS; RS → MSs | same as (1) | BS → RS; RS → MSs | O(I_J(Π_BS + Π_RS))
(6) Joint precoding (design at RS) | same as (1) | RS → BS, MSs | same as (2) | RS → BS, MSs | O(I_J(Π_BS + Π_RS))

V. Simulation results 

In this section, some numerical examples are presented to
evaluate the proposed precoding designs. The channels are set
to be Rayleigh fading, i.e., the elements of each channel matrix
or vector are complex Gaussian random variables with zero
mean and unit variance. We assume that the noise powers at
all the destinations are the same, i.e., $\sigma_B^2 = \sigma_R^2 = \sigma_k^2 = 1$,
$\forall k$. The transmission powers at all the MSs and the RS are the
same, i.e., $P_R = P_k = P$, $\forall k$, and the transmission power
at the BS is assumed to be $P_B = LP$, where $L$ is a
constant. For all the simulations, 1000 channel realizations
have been simulated. Moreover, 10000 quadrature phase-shift
keying (QPSK) symbols are transmitted from each source
node for each channel realization when simulating bit-error-rate
(BER) performance. For all comparisons, if not specified
otherwise, the fixed RS precoder $\mathbf{F}$ in the BS precoding design
is chosen as $\mathbf{F} = \mathbf{I}_M$ and the fixed BS precoder $\mathbf{B}$ in the RS
precoding design is chosen as $\mathbf{B} = \sqrt{P_B/K}\,\mathbf{I}_{N \times K}$.
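
The simulation setup described above can be reproduced with a few lines of NumPy. This is a minimal sketch under stated assumptions: the function names and dictionary keys are hypothetical, and the channel dimensions (BS to RS as M x N, RS to BS as N x M, MSs to RS as M x K, RS to MSs as K x M) are inferred from the matrix products used in the appendices rather than taken from an explicit definition in this section.

```python
import numpy as np


def rayleigh(rows, cols, rng):
    """I.i.d. complex Gaussian entries with zero mean and unit variance."""
    real = rng.standard_normal((rows, cols))
    imag = rng.standard_normal((rows, cols))
    return (real + 1j * imag) / np.sqrt(2.0)


def default_setup(N, M, K, P, L, rng):
    """Draw one channel realization and build the default fixed precoders.

    P is the common MS/RS transmit power on a linear scale and L is the
    BS-to-MS power ratio, so that the BS power is P_B = L * P.
    """
    P_B = L * P
    channels = {
        "H1": rayleigh(M, N, rng),  # BS -> RS (assumed dimension M x N)
        "G1": rayleigh(N, M, rng),  # RS -> BS (assumed dimension N x M)
        "H2": rayleigh(M, K, rng),  # MSs -> RS, one column per MS
        "G2": rayleigh(K, M, rng),  # RS -> MSs, one row per MS
    }
    F = np.eye(M, dtype=complex)                        # fixed RS precoder I_M
    B = np.sqrt(P_B / K) * np.eye(N, K, dtype=complex)  # fixed BS precoder
    return channels, F, B, P_B


rng = np.random.default_rng(0)
P_linear = 10 ** (5 / 10)  # P = 5 dB converted to linear scale
channels, F, B, P_B = default_setup(N=3, M=2, K=2, P=P_linear, L=5, rng=rng)
print(np.trace(B @ B.conj().T).real, P_B)  # BS transmit power equals P_B when N >= K
```

With unit-variance data symbols, the scaling $\sqrt{P_B/K}$ makes the default BS precoder meet the BS power budget with equality whenever N >= K.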

In Fig. 2, we check the optimality of the proposed RS precoding
designs, Algorithm 1 and Algorithm 2, for Total-MSE
minimization in Fig. 2(a) and Fig. 2(b), respectively, by trying
different initialization points at three sets of given but arbitrary
channel realizations. Specifically, for each channel realization,
six different initialization points, including the identity matrix
and five random matrices, are simulated. Moreover, for K = 3,
we choose three channel realizations where the randomization
technique is needed to find a quasi-optimal rank-one solution
of (20). Fig. 2(a) shows that Algorithm 1 for K = 2
converges to a unique solution from any initialization point.
Fig. 2(b) shows that Algorithm 2 for K > 2 also
converges to solutions that are close to each other
from different initialization points. Thus we conclude that the
proposed iterative RS precoding for Total-MSE minimization
can indeed approach the optimal solution.

Fig. 3. Convergence behavior of the proposed iterative precoding design and
complexity of randomization at P = 5 dB and L = 5: (a) convergence behavior;
(b) complexity of randomization.

Fig. 4. Performance comparison for different precoding designs with
N = 2, M = 2, K = 2: (a) BER comparison; (b) sum-rate comparison.

Fig. 5. Performance comparison for the BS and RS precoding designs with
different antenna configurations (N, M) at L = 5 with K = 2: (a) BER
comparison; (b) sum-rate comparison.



In Fig. 3(a), the convergence behavior of the proposed RS
and joint precoding designs for Total-MSE minimization is
shown as a function of the iteration index at P = 5 dB and
L = 5. We observe that the proposed RS precoding converges
in 20 iterations for K = 2 and in 30 iterations for K = 3.
Moreover, the proposed joint precoding algorithm converges
within 10 iterations for both two and three MSs.² Fig. 3(b)
illustrates the number of random samples required in solving (20) by
using randomization to approach the lower bound obtained
from (26). We observe that as the number of samples
increases, a better solution can be obtained. But when the
number exceeds 2000, the obtained solution does not change
much, which indicates that 2000 samples are enough
in general to generate a near-optimal solution.

In Fig. 4, we show the uplink BER and sum rate comparisons
of all the proposed precoding designs as functions of
P for N = 2, M = 2, K = 2 at L = 1 and L = 10 dB. Here
the notation "-MSE" means that the precoding is designed
based on the Total-MSE criterion, while "-Rate" means that the
precoding is designed based on the sum rate criterion. For fair
comparison and to make our optimization problems feasible,
we set the SINR requirements in (12) as $\lambda_k = \epsilon_k$, $\forall k$, where $\epsilon_k$
is the SINR at MS k when no precoding is employed, i.e.,
both $\mathbf{B}$ and $\mathbf{F}$ are identity matrices. We observe that when the
BS has the same power as the RS and MSs, i.e., L = 1, the RS
precoding design outperforms the BS precoding design in
both BER and sum rate. When the BS has more
power than the RS and MSs, i.e., L = 10, the BS precoding
can achieve better uplink performance than the RS precoding
in a certain SNR regime. The reason is that with more power
at the BS, the interference observed at each MS is introduced
mainly by the downlink transmission. Then the precoding at
the BS becomes important in coordinating the interference,
which makes the BS precoding more effective than the RS
precoding in improving the uplink performance.

² Here, for the inner RS precoding design, we set the maximum iteration
number as 20 for K = 2 and 30 for K = 3.

Fig. 5 illustrates the BER and sum rate comparison for
different BS and RS antenna configurations (N, M) at K = 2
with the total number of BS and RS antennas fixed at
N + M = 6. For fair comparison, the target SINR at each
MS is set as $\lambda_k = -5$ dB, $\forall k$, and the uplink performance
is averaged over the cases where the BS and RS precoding
designs are feasible. We see that when the BS has more
antennas than the RS, i.e., at (4, 2), the BS precoding performs
better than the RS precoding. The reason is that increasing
the number of BS antennas helps not only the
BS precoding, but also the decoding of the uplink
transmission. However, when the RS has more antennas than
the BS, the system performance can be significantly enhanced
and the RS precoding greatly outperforms the BS precoding.
This indicates that additional antennas are more useful at the RS
than at the BS. This is because the BS precoding only
tries to let the downlink use less RS power to satisfy
the SINR requirements at the MSs, so that more RS power
can be allocated to the uplink to improve the performance.
However, the RS precoding is directly relevant to the uplink
transmission. A well-designed RS precoder can change the
uplink channel matrix, not only the power.

Fig. 6. Performance comparison with [15] at N = 2, M = 2, K = 2 and
L = 10: (a) BER comparison; (b) sum-rate comparison.

In Fig. 6, we compare the proposed precoding designs with
the joint precoding design in [15] for K = 2 at L = 10. For
fairness, we set the SINR requirements in (12) as $\lambda_k = \epsilon_k$, $\forall k$,
where $\epsilon_k$ is the SINR obtained by using the precoders obtained
in [15]. Specifically, the RS and BS precoders obtained from
[15] are chosen as the fixed RS precoder in "Proposed-BS" and
the fixed BS precoder in "Proposed-RS", respectively. Under
this setup, we find that further optimizing the BS precoder or
the RS precoder yields additional performance gain over [15].
Fig. 6 also shows that the RS precoding achieves most of the
performance gain of the joint precoding, which implies that
the ZF BS precoding obtained in [15] is indeed a good choice
for improving the system performance.

VI. Conclusions

In this paper, we studied linear precoding designs for
multiuser two-way relay systems in a cellular network for
maximizing the uplink performance while maintaining the
downlink QoS requirements. Three precoding schemes were
considered, namely, the BS precoding, the RS precoding and
the joint BS-RS precoding. By recasting the precoding designs
into suitable forms, we obtained the optimal solution for the
BS precoding and locally optimal solutions for both the RS
precoding and the joint BS-RS precoding. The performance of
these precoding designs was compared and some practical
implementation issues were discussed. Simulation results showed
that the RS precoding design is more efficient than the BS
precoding design in most cases. The results also demonstrated
the superiority of the proposed precoding designs over existing
ones.

Appendix A
Proof of Lemma 1


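To prove Lemma 1, we only need to verify that the functions
$f_1(\beta) = \mathrm{Tr}\left(\mathbf{E}(\beta)^{-1}\right)$ and $f_2(\beta) = \log_2 \det\left(\mathbf{E}(\beta)\right)$ with

$$ \mathbf{E}(\beta) = \mathbf{I}_K + \mathbf{P}^H \mathbf{H}_2^H \mathbf{F}^H \mathbf{G}_1^H \left( \sigma_R^2 \mathbf{G}_1 \mathbf{F} \mathbf{F}^H \mathbf{G}_1^H + \beta \sigma_B^2 \mathbf{I}_N \right)^{-1} \mathbf{G}_1 \mathbf{F} \mathbf{H}_2 \mathbf{P}, $$

where $\beta = 1/\alpha^2$, are monotonically increasing and decreasing
with respect to $\beta$, respectively. To this end, we have

$$ \frac{d f_1(\beta)}{d\beta}
 = \mathrm{Tr}\left( -\mathbf{E}(\beta)^{-1} \frac{d\left( \mathbf{P}^H \mathbf{H}_2^H \mathbf{F}^H \mathbf{G}_1^H \mathbf{R}^{-1} \mathbf{G}_1 \mathbf{F} \mathbf{H}_2 \mathbf{P} \right)}{d\beta} \mathbf{E}(\beta)^{-1} \right)
 = \mathrm{Tr}\left( \sigma_B^2 \mathbf{E}(\beta)^{-1} \mathbf{P}^H \mathbf{H}_2^H \mathbf{F}^H \mathbf{G}_1^H \mathbf{R}^{-2} \mathbf{G}_1 \mathbf{F} \mathbf{H}_2 \mathbf{P} \mathbf{E}(\beta)^{-1} \right) > 0, $$

$$ \frac{d f_2(\beta)}{d\beta}
 = \log_2 e \cdot \mathrm{Tr}\left( \mathbf{E}(\beta)^{-1} \frac{d\left( \mathbf{P}^H \mathbf{H}_2^H \mathbf{F}^H \mathbf{G}_1^H \mathbf{R}^{-1} \mathbf{G}_1 \mathbf{F} \mathbf{H}_2 \mathbf{P} \right)}{d\beta} \right)
 = -\log_2 e \cdot \mathrm{Tr}\left( \sigma_B^2 \mathbf{E}(\beta)^{-1} \mathbf{P}^H \mathbf{H}_2^H \mathbf{F}^H \mathbf{G}_1^H \mathbf{R}^{-2} \mathbf{G}_1 \mathbf{F} \mathbf{H}_2 \mathbf{P} \right) < 0, $$

where $\mathbf{R} = \sigma_R^2 \mathbf{G}_1 \mathbf{F} \mathbf{F}^H \mathbf{G}_1^H + \beta \sigma_B^2 \mathbf{I}_N$. For both inequalities,
we have used the fact that both $\mathbf{E}$ and $\mathbf{R}$ are positive definite.
Thus, the proof is completed.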
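
The monotonicity claimed in this proof is easy to check numerically on random data. The snippet below is an illustrative sketch only: the helper names are hypothetical, the matrices are random placeholders rather than channels from the paper's simulations, and unit noise variances are assumed by default.

```python
import numpy as np


def E_of_beta(beta, G1, F, H2, P, sigma_R2=1.0, sigma_B2=1.0):
    """E(beta) = I_K + A^H R^{-1} A with A = G1 F H2 P and
    R = sigma_R2 * G1 F F^H G1^H + beta * sigma_B2 * I_N."""
    N = G1.shape[0]
    K = P.shape[0]
    G1F = G1 @ F
    A = G1F @ H2 @ P
    R = sigma_R2 * G1F @ G1F.conj().T + beta * sigma_B2 * np.eye(N)
    return np.eye(K) + A.conj().T @ np.linalg.solve(R, A)


def f1(beta, G1, F, H2, P):
    """Tr(E(beta)^{-1}), the quantity claimed to increase with beta."""
    return np.trace(np.linalg.inv(E_of_beta(beta, G1, F, H2, P))).real


def f2(beta, G1, F, H2, P):
    """log2 det(E(beta)), the quantity claimed to decrease with beta."""
    _, logabsdet = np.linalg.slogdet(E_of_beta(beta, G1, F, H2, P))
    return logabsdet / np.log(2.0)


def random_system(rng, N=3, M=3, K=2):
    """Random placeholder matrices with the dimensions assumed above."""
    def cplx(r, c):
        return (rng.standard_normal((r, c))
                + 1j * rng.standard_normal((r, c))) / np.sqrt(2.0)
    return cplx(N, M), cplx(M, M), cplx(M, K), np.sqrt(2.0) * np.eye(K)


rng = np.random.default_rng(0)
system = random_system(rng)
betas = np.linspace(0.1, 10.0, 50)
f1_values = np.array([f1(b, *system) for b in betas])
f2_values = np.array([f2(b, *system) for b in betas])
print(np.all(np.diff(f1_values) > 0), np.all(np.diff(f2_values) < 0))
```

Such a check does not replace the proof, but it is a quick way to catch sign errors when re-deriving the two derivatives.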

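Appendix B
Transformations from (19) to (20)

We first rewrite the objective function $f(\mathbf{F}, \mathbf{W})$ in (19) as

$$ f(\mathbf{F}, \mathbf{W}) = \mathbf{f}^H \mathbf{Q}_0 \mathbf{f} - \mathbf{f}^H \mathbf{q} - \mathbf{q}^H \mathbf{f} + q_0, \tag{33} $$

where $\mathbf{q} \triangleq \mathrm{vec}\left(\mathbf{G}_1^H \mathbf{W}^H \mathbf{P}^H \mathbf{H}_2^H\right)$, $q_0 \triangleq \mathrm{Tr}\left(\sigma_B^2 \mathbf{W}\mathbf{W}^H + \mathbf{I}_K\right)$ and

$$ \mathbf{f} = \mathrm{vec}(\mathbf{F}), \qquad \mathbf{Q}_0 = \left(\mathbf{H}_2 \mathbf{P}\mathbf{P}^H \mathbf{H}_2^H + \sigma_R^2 \mathbf{I}_M\right)^T \otimes \left(\mathbf{G}_1^H \mathbf{W}^H \mathbf{W} \mathbf{G}_1\right). \tag{34} $$

Here the second and third terms in (33) are obtained from the
corresponding terms of the objective function in (19) by using
the rule $\mathrm{Tr}(\mathbf{A}^T \mathbf{B}) = (\mathrm{vec}(\mathbf{A}))^T \mathrm{vec}(\mathbf{B})$ [30]. The first term
of (33) is the reformulation of the first term of the objective
function in (19) by using the rule [30]

$$ \mathrm{Tr}(\mathbf{A}\mathbf{B}\mathbf{C}\mathbf{D}) = (\mathrm{vec}(\mathbf{D}^T))^T (\mathbf{C}^T \otimes \mathbf{A})\, \mathrm{vec}(\mathbf{B}). \tag{35} $$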
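
The two vectorization rules used above can be verified numerically, which is a useful sanity check when implementing the transformation. The sketch below assumes the usual column-stacking definition of vec(.) and uses random placeholder matrices; the function names are hypothetical.

```python
import numpy as np


def vec(X):
    """Stack the columns of X into one vector (column-major order)."""
    return X.reshape(-1, order="F")


def trace_abcd_via_vec(A, B, C, D):
    """Right-hand side of Tr(ABCD) = vec(D^T)^T (C^T kron A) vec(B)."""
    return vec(D.T) @ np.kron(C.T, A) @ vec(B)


def quadratic_form(F, C, D):
    """vec(F)^H (C^T kron D) vec(F), which should equal Tr(F C F^H D)."""
    f = vec(F)
    return f.conj() @ np.kron(C.T, D) @ f


def random_complex(rng, rows, cols):
    return rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))


rng = np.random.default_rng(0)
A = random_complex(rng, 2, 3)
B = random_complex(rng, 3, 4)
C = random_complex(rng, 4, 5)
D = random_complex(rng, 5, 2)
print(np.isclose(np.trace(A @ B @ C @ D), trace_abcd_via_vec(A, B, C, D)))

# Same rule applied to a term of the form Tr(F C F^H D), as in the first term of (33).
F = random_complex(rng, 3, 3)
C2 = random_complex(rng, 3, 3)
D2 = random_complex(rng, 3, 3)
print(np.isclose(np.trace(F @ C2 @ F.conj().T @ D2), quadratic_form(F, C2, D2)))
```

Note that NumPy flattens in row-major order by default, so the explicit `order="F"` is what makes `vec` agree with the Kronecker-product ordering in (35).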

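Again according to (35), the relay power constraint
in (19) can be re-expressed as

$$ \mathbf{f}^H \mathbf{Q}_x \mathbf{f} \le P_R, $$

where

$$ \mathbf{Q}_x = \left(\mathbf{H}_1 \mathbf{B}\mathbf{B}^H \mathbf{H}_1^H + \mathbf{H}_2 \mathbf{P}\mathbf{P}^H \mathbf{H}_2^H + \sigma_R^2 \mathbf{I}_M\right)^T \otimes \mathbf{I}_M. \tag{36} $$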

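The SINR constraint for MS k in (19) is equivalent to, by
simple manipulations,

$$ \mathrm{Tr}\left( \mathbf{g}_{2k}^{*} \mathbf{g}_{2k}^{T} \mathbf{F} \left( \mathbf{H}_1 \mathbf{b}_k \mathbf{b}_k^H \mathbf{H}_1^H - \lambda_k \left( \sum_{i \ne k} \left( \mathbf{H}_1 \mathbf{b}_i \mathbf{b}_i^H \mathbf{H}_1^H + P_i \mathbf{h}_{2i} \mathbf{h}_{2i}^H \right) + \sigma_R^2 \mathbf{I}_M \right) \right) \mathbf{F}^H \right) \ge \lambda_k \sigma_k^2. \tag{37} $$

By using (35), inequality (37) can be rewritten as

$$ \mathbf{f}^H \mathbf{Q}_k \mathbf{f} \ge \lambda_k \sigma_k^2, $$

where

$$ \mathbf{Q}_k = \left( \mathbf{H}_1 \mathbf{b}_k \mathbf{b}_k^H \mathbf{H}_1^H - \lambda_k \left( \sum_{i \ne k} \left( \mathbf{H}_1 \mathbf{b}_i \mathbf{b}_i^H \mathbf{H}_1^H + P_i \mathbf{h}_{2i} \mathbf{h}_{2i}^H \right) + \sigma_R^2 \mathbf{I}_M \right) \right)^T \otimes \left( \mathbf{g}_{2k}^{*} \mathbf{g}_{2k}^{T} \right). \tag{38} $$

Finally, (19) can be readily written in the form of (20).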



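Appendix C
Proof of Theorem 1

Note that if K = 1, the optimal rank-one solution can be
obtained as claimed in Lemma 3.1 given in [31]; here we omit
it for brevity. In the case where (23) has an optimal rank-one
solution, it is indeed the optimal solution of (22). Next we
focus on the case where K = 2 and the rank of the optimal solution
of (23) is higher than one. Since the optimization problem (23)
is convex, the sufficient and necessary optimality conditions
(or termed the complementary slackness conditions) are

$$ \begin{aligned}
& y_k \mathrm{Tr}(\mathbf{Q}_k \mathbf{X}_F) = 0, \quad y_k \ge 0, \quad k = 1, 2, \\
& y_3 \mathrm{Tr}(\mathbf{Q}_x \mathbf{X}_F) = 0, \quad y_4 \left( \mathrm{Tr}(\mathbf{Q} \mathbf{X}_F) - 1 \right) = 0, \\
& y_3 \ge 0, \quad y_4 \in \mathbb{R},
\end{aligned} \tag{39} $$

where $y_i$, for $i = 1, 2, 3, 4$, are dual variables and

$$ \mathrm{Tr}(\mathbf{Z} \mathbf{X}_F) = 0 \tag{40} $$

with $\mathbf{Z} = \mathbf{Q}_0 + y_1 \mathbf{Q}_1 + y_2 \mathbf{Q}_2 + y_3 \mathbf{Q}_x + y_4 \mathbf{Q} \succeq 0$. To proceed,
we assume that $\alpha_i = \mathrm{Tr}(\mathbf{Q}_i \mathbf{X}_F)$, $i = 1, 2, x$.

We first consider the case where at least one inequality
constraint in (23) is inactive, i.e., at least one $\alpha_i < 0$. Suppose
that the rank of the obtained $\mathbf{X}_F$ in (23) is $R$ and it can
be decomposed as $\mathbf{X}_F = \mathbf{V}\mathbf{V}^H$ with $\mathbf{V} \in \mathbb{C}^{(M^2+1) \times R}$. By
applying the trick used in [31], we introduce a Hermitian
matrix $\mathbf{M}$ to satisfy

$$ \mathrm{Tr}\left(\mathbf{V}^H \mathbf{Q}_k \mathbf{V} \mathbf{M}\right) = 0, \quad \mathrm{Tr}\left(\mathbf{V}^H \mathbf{Q}_x \mathbf{V} \mathbf{M}\right) = 0, \quad k = 1, 2, \tag{41} $$

where $\mathbf{M} \in \mathbb{C}^{R \times R}$ has $R^2$ real elements. If $R^2 > 3$, there
always exists a nonzero solution $\mathbf{M}$ satisfying (41). Let $\delta_i$, for
$i = 1, 2, \ldots, R$, be the eigenvalues of $\mathbf{M}$ and define $|\delta_0| = \max\{|\delta_i|, \forall i\}$.
Then, we get $\mathbf{X}'_F = \mathbf{V}\left(\mathbf{I}_R - (1/\delta_0)\mathbf{M}\right)\mathbf{V}^H$
and further set $\mathbf{X}''_F = \mathbf{X}'_F / a$ with $a = \mathbf{X}'_F(1,1)$. Here we
note that $a > 0$ due to the fact that $\mathbf{X}'_F$ is positive semidefinite
and $\mathbf{Q}_k$ is positive definite. It is not hard to see that the
rank of $\mathbf{X}''_F$ is reduced by at least one. We next verify that
$\mathbf{X}''_F$ is still an optimal solution of (23). First, we check the
primal feasibility of $\mathbf{X}''_F$. With $\mathbf{X}''_F(1,1) = 1$, the condition
$\mathrm{Tr}(\mathbf{Q}\mathbf{X}''_F) = 1$ is satisfied. Moreover, since
$\mathrm{Tr}(\mathbf{Q}_i \mathbf{X}'_F) = \mathrm{Tr}(\mathbf{Q}_i \mathbf{X}_F)$, $i = 1, 2, x$, and $a > 0$,
the constraints $\mathrm{Tr}(\mathbf{Q}_x \mathbf{X}''_F) \le 0$ and
$\mathrm{Tr}(\mathbf{Q}_i \mathbf{X}''_F) \le 0$, $i = 1, 2$, are also satisfied. Second, we
need to check the complementary conditions in (39) and (40).
It is found that if $\mathrm{Tr}(\mathbf{Q}_i \mathbf{X}_F) = 0$, $i = 1, 2, x$, we must
have $\mathrm{Tr}(\mathbf{Q}_i \mathbf{X}'_F) = 0$, $i = 1, 2, x$. Then $\mathrm{Tr}(\mathbf{Q}_i \mathbf{X}''_F) = 0$,
$i = 1, 2, x$, holds, which means that (39) is satisfied. On
the other hand, if $\mathrm{Tr}(\mathbf{Q}_i \mathbf{X}_F) \ne 0$, $i = 1, 2, x$, it means that
$y_i = 0$. Then dividing $\mathbf{X}'_F$ by $a$ does not affect the satisfaction
of (39). For (40), since $\mathrm{Tr}(\mathbf{Z}\mathbf{X}'_F) = \mathrm{Tr}(\mathbf{Z}\mathbf{X}_F) = 0$,
$\mathrm{Tr}(\mathbf{Z}\mathbf{X}''_F)$ must be equal to zero. Thus $\mathbf{X}''_F$ also satisfies the
condition in (40). Therefore, $\mathbf{X}''_F$ is also an optimal solution
of (23). Repeat the above procedure until $R^2 \le 3$; then an
optimal rank-one solution is obtained. For completeness, we
present the detailed procedure as follows:

• Solve optimization problem (23) and get the optimal solution $\mathbf{X}_F$ with
rank $R$;

• Repeat

- Decompose $\mathbf{X}_F$ as $\mathbf{X}_F = \mathbf{V}\mathbf{V}^H$;

- Find a nonzero $R \times R$ Hermitian solution $\mathbf{M}$ of the following
linear equations: $\mathrm{Tr}\left(\mathbf{V}^H \mathbf{Q}_i \mathbf{V} \mathbf{M}\right) = 0$, $i = 1, 2, x$;

- Evaluate the eigenvalues $\delta_1, \delta_2, \ldots, \delta_R$ of $\mathbf{M}$ and set $|\delta_0| = \max\{|\delta_i|, \forall i\}$;

- Compute $\mathbf{X}'_F = \mathbf{V}\left(\mathbf{I}_R - (1/\delta_0)\mathbf{M}\right)\mathbf{V}^H$ and further get $\mathbf{X}''_F = \mathbf{X}'_F / a$
with $a = \mathbf{X}'_F(1,1)$;

- Set $\mathbf{X}_F = \mathbf{X}''_F$.

• Until the rank $R = \mathrm{Rank}(\mathbf{X}_F)$ is equal to 1.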
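
The rank-reduction loop above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the constraint matrices in the usage example are random Hermitian placeholders rather than the matrices of (23), and the final normalization by $a$ is omitted so that the sketch isolates the rank-reduction step itself (which preserves every trace value exactly).

```python
import numpy as np


def hermitian_basis(R):
    """R*R matrices forming a real basis of the R x R Hermitian matrices."""
    basis = []
    for i in range(R):
        for j in range(i, R):
            E = np.zeros((R, R), dtype=complex)
            if i == j:
                E[i, i] = 1.0
                basis.append(E)
            else:
                E[i, j] = E[j, i] = 1.0
                basis.append(E)
                S = np.zeros((R, R), dtype=complex)
                S[i, j], S[j, i] = 1j, -1j
                basis.append(S)
    return basis


def rank_reduce_step(X, Qs, tol=1e-9):
    """One step: return X' = V (I - M/delta0) V^H, or X itself if R^2 <= len(Qs)."""
    w, U = np.linalg.eigh(X)
    keep = w > tol * max(w.max(), 1.0)
    V = U[:, keep] * np.sqrt(w[keep])          # X = V V^H with R columns
    R = V.shape[1]
    if R * R <= len(Qs):                       # a nonzero M is no longer guaranteed
        return X
    basis = hermitian_basis(R)
    # Real linear system: Tr(V^H Q V M) = 0 for every constraint matrix Q.
    A = np.array([[np.trace(V.conj().T @ Q @ V @ E).real for E in basis]
                  for Q in Qs])
    _, _, Vt = np.linalg.svd(A)
    coeffs = Vt[-1]                            # lies in the null space (rows < columns)
    M = sum(c * E for c, E in zip(coeffs, basis))
    delta = np.linalg.eigvalsh(M)
    delta0 = delta[np.argmax(np.abs(delta))]   # eigenvalue of largest magnitude
    return V @ (np.eye(R) - M / delta0) @ V.conj().T


def reduce_rank(X, Qs, tol=1e-9):
    """Repeat the step until no further reduction is possible."""
    for _ in range(X.shape[0]):
        X_new = rank_reduce_step(X, Qs, tol)
        if X_new is X:
            break
        X = X_new
    return X


# Usage with random placeholder data: a rank-3 PSD matrix and three constraints.
rng = np.random.default_rng(0)
n = 5
V0 = rng.standard_normal((n, 3)) + 1j * rng.standard_normal((n, 3))
X0 = V0 @ V0.conj().T
Qs = []
for _ in range(3):
    G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Qs.append(G + G.conj().T)
X1 = reduce_rank(X0, Qs)
print(np.linalg.matrix_rank(X1, tol=1e-8))
```

Because $\mathbf{I}_R - (1/\delta_0)\mathbf{M}$ has eigenvalues $1 - \delta_i/\delta_0 \ge 0$ with at least one equal to zero, each step keeps the matrix positive semidefinite and lowers its rank by at least one.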

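Then we consider the case where all the inequality constraints
are active, i.e., $\alpha_i = 0$, for $i = 1, 2, x$. Note that
since $M^2 + 1 > 4$, the size of the matrix $\mathbf{X}_F$ in (23) is always
larger than four. Suppose $R \ge 3$. Based on Theorem 2.1
given in [32], we obtain that there is a rank-one decomposition
for $\mathbf{X}_F$ (synthetically denoted as $\mathcal{D}(\mathbf{X}_F, \mathbf{Q}_1, \mathbf{Q}_2, \mathbf{Q}_x)$), i.e.,
$\mathbf{X}_F = \sum_{r=1}^{R} \mathbf{x}_r \mathbf{x}_r^H$, such that

$$ \mathrm{Tr}\left(\mathbf{Q}_k \mathbf{x}_r \mathbf{x}_r^H\right) = 0, \quad k = 1, 2, \quad r = 1, 2, \ldots, R, $$

and

$$ \mathrm{Tr}\left(\mathbf{Q}_x \mathbf{x}_r \mathbf{x}_r^H\right) = 0, \quad r = 1, 2, \ldots, R-2. $$

By generating $\mathbf{X}'_F = \mathbf{x}_1 \mathbf{x}_1^H$ and $\mathbf{X}''_F = \mathbf{X}'_F / \mathbf{X}'_F(1,1)$
(we again note that $\mathbf{X}'_F(1,1) > 0$), it is easy to check that
$\mathbf{X}''_F$ is feasible for (23) and satisfies the optimality
conditions (39) and (40) together with the optimal dual solution
$\{y_1, y_2, y_3, y_4\}$. Therefore, $\mathbf{X}''_F$ can be regarded as an optimal
rank-one solution of (23).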

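Appendix D
Procedure to get a suboptimal rank-one solution

If (23) has an optimal solution of rank two with all the
constraints being active, we next give a method to obtain a
good feasible solution. Let the optimal solution in (23) be in
the form

$$ \mathbf{X}_F = \begin{bmatrix} 1 & \mathbf{z}^H \\ \mathbf{z} & \mathbf{X} \end{bmatrix}. $$

We have

$$ \begin{aligned}
\alpha_k &= \mathrm{Tr}(\mathbf{Q}_k \mathbf{X}_F) = \lambda_k \sigma_k^2 - \mathrm{Tr}(\mathbf{Q}_k \mathbf{X}), \quad k = 1, 2, \\
\alpha_x &= \mathrm{Tr}(\mathbf{Q}_x \mathbf{X}_F) = -P_R + \mathrm{Tr}(\mathbf{Q}_x \mathbf{X}).
\end{aligned} $$

Here the matrices multiplying $\mathbf{X}_F$ are those of (23), whereas the
matrices multiplying $\mathbf{X}$ are the $M^2 \times M^2$ matrices defined in (38) and (36).
That is,

$$ \begin{aligned}
\beta_k &= \mathrm{Tr}(\mathbf{Q}_k \mathbf{X}) = \lambda_k \sigma_k^2 - \alpha_k, \quad k = 1, 2, \\
\beta_x &= \mathrm{Tr}(\mathbf{Q}_x \mathbf{X}) = \alpha_x + P_R.
\end{aligned} $$

Then we have $\mathrm{Tr}\left(\left(\mathbf{Q}_k - \frac{\beta_k}{\beta_x}\mathbf{Q}_x\right)\mathbf{X}\right) = 0$. Again according to
Theorem 2.1 given in [33], we obtain that there is a rank-one
matrix decomposition (synthetically denoted as
$\mathcal{D}\left(\mathbf{X}, \mathbf{Q}_1 - \frac{\beta_1}{\beta_x}\mathbf{Q}_x, \mathbf{Q}_2 - \frac{\beta_2}{\beta_x}\mathbf{Q}_x\right)$),
$\mathbf{X} = \sum_{r=1}^{S} \mathbf{f}_r \mathbf{f}_r^H$ ($S = \mathrm{Rank}(\mathbf{X}) \le R$),
such that

$$ \mathbf{f}_r^H \left(\mathbf{Q}_k - \frac{\beta_k}{\beta_x}\mathbf{Q}_x\right) \mathbf{f}_r = 0, \quad k = 1, 2, \quad r = 1, 2, \ldots, S. $$

We take $\mathbf{f}_1$ and set

$$ \gamma = \frac{\beta_x}{\mathbf{f}_1^H \mathbf{Q}_x \mathbf{f}_1}. $$

It can be verified that
$(\sqrt{\gamma}\mathbf{f}_1)^H \mathbf{Q}_k (\sqrt{\gamma}\mathbf{f}_1) = \beta_k$, for $k = 1, 2, x$, and that $\mathbf{X}' = \mathbf{x}_1 \mathbf{x}_1^H$
with $\mathbf{x}_1 = [1, (\sqrt{\gamma}\mathbf{f}_1)^T]^T$ is feasible for (23) and can
be regarded as a suboptimal rank-one solution of (23).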